**Signals and Communication Technology**

Martin Tomlinson · Cen Jung Tjhai · Marcel A. Ambroze · Mohammed Ahmed · Mubarak Jibril

# Error-Correction Coding and Decoding

Bounds, Codes, Decoders, Analysis and Applications


More information about this series at http://www.springer.com/series/4748


Martin Tomlinson School of Computing, Electronics and Mathematics Plymouth University Plymouth, Devon UK

Cen Jung Tjhai PQ Solutions Limited London UK

Marcel A. Ambroze School of Computing, Electronics and Mathematics Plymouth University Plymouth, Devon UK

Mohammed Ahmed School of Computing, Electronics and Mathematics Plymouth University Plymouth, Devon UK

Mubarak Jibril Satellite Applications and Development Nigeria Communications Satellite Limited Abuja Nigeria

ISSN 1860-4862 (print), ISSN 1860-4870 (electronic)
Signals and Communication Technology
ISBN 978-3-319-51102-3, ISBN 978-3-319-51103-0 (eBook)
DOI 10.1007/978-3-319-51103-0

Library of Congress Control Number: 2016963415

© The Editor(s) (if applicable) and The Author(s) 2017. This book is published open access. Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature. The registered company is Springer International Publishing AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

This book is dedicated to our families and loved ones.

# Preface

The research work described in this book comprises some of the work carried out by the authors whilst working in the Coding Group at the University of Plymouth, UK. The Coding Group consists of enthusiastic research students and research and teaching staff members, providing a very stimulating environment in which to work. While much of the work was driven by academic research, a significant number of studies were driven by the communications industry, with its many varying applications and requirements for error-correcting codes. This partly explains the variety of topics covered in this book.

Plymouth, UK Martin Tomlinson
London, UK Cen Jung Tjhai
Plymouth, UK Marcel A. Ambroze
Plymouth, UK Mohammed Ahmed
Abuja, Nigeria Mubarak Jibril

# Acknowledgements

We would like to thank all of our past and present research students, our friends and fellow researchers around the world who have helped our understanding of this fascinating and sometimes tricky subject. Special thanks go to our research collaborators Des Taylor, Philippa Martin, Shu Lin, Marco Ferrari, Patrick Perry, Mark Fossorier, Martin Bossert, Eirik Rosnes, Sergey Bezzateev, Markus Grassl, Francisco Cercas and Carlos Salema. Thanks also go to Dan Costello, Bob McEliece, Dick Blahut, David Forney, Ralf Johannason, Bahram Honary, Jim Massey and Paddy Farrell for interesting and informed discussions. We would also like to thank Licha Mued for spending long hours editing the manuscript.

# Contents

#### Part I Theoretical Performance of Error-Correcting Codes




#### Part II Code Construction










# **Part I Theoretical Performance of Error-Correcting Codes**

This part of the book deals with the theoretical performance of error-correcting codes. Upper and lower bounds are given for the achievable performance of error-correcting codes for the additive white Gaussian noise (AWGN) channel. Also given are bounds on constructions of error-correcting codes in terms of normalised minimum distance and code rate. Differences between ideal soft decision decoding and hard decision decoding are also explored. The results from the numerical evaluation of several different code examples are compared to the theoretical bounds with some interesting conclusions.

# **Chapter 1 Bounds on Error-Correction Coding Performance**

#### **1.1 Gallager's Coding Theorem**

The sphere packing bound by Shannon [18] provides a lower bound to the frame error rate (FER) achievable by an $(n, k, d)$ code but is not directly applicable to binary codes. Gallager [4] presented his coding theorem for the average FER for the ensemble of all random binary $(n, k, d)$ codes. There are $2^n$ possible binary combinations for each codeword, which in terms of the $n$-dimensional signal space hypercube corresponds to one vertex taken from $2^n$ possible vertices. There are $2^k$ codewords, and therefore $2^{n2^k}$ different possible random codes. The receiver is considered to be composed of $2^k$ matched filters, one for each codeword, and a decoder error occurs if any of the matched filter receivers has a larger output than the matched filter receiver corresponding to the transmitted codeword. Consider this matched filter receiver and another different matched filter receiver, and assume that the two codewords differ in $d$ bit positions. The Hamming distance between the two codewords is $d$. The energy per transmitted bit is $E_s = \frac{k}{n}E_b$, where $E_b$ is the energy per information bit. The noise variance per matched filtered received bit is $\sigma^2 = \frac{N_0}{2}$, where $N_0$ is the single sided noise spectral density. In the absence of noise, the output of the matched filter receiver for the transmitted codeword is $n\sqrt{E_s}$ and the output of the other codeword matched filter receiver is $(n - 2d)\sqrt{E_s}$. The noise voltage at the output of the matched filter receiver for the transmitted codeword is denoted as $n_c - n_1$, and the noise voltage at the output of the other matched filter receiver will be $n_c + n_1$.
The common noise voltage $n_c$ arises from correlation of the bits common to both codewords with the received noise, and the noise voltages $-n_1$ and $n_1$ arise, respectively, from correlation of the other $d$ bits with the received noise. A decoder error occurs if

$$(n - 2d)\sqrt{E\_s} + n\_c + n\_1 > n\sqrt{E\_s} + n\_c - n\_1\tag{1.1}$$

that is, a decoder error occurs when $2n_1 > 2d\sqrt{E_s}$.


The average noise power associated with $n_1$ is $d\sigma^2 = d\frac{N_0}{2}$ and, as the noise is Gaussian distributed, the probability of decoder error, $p_d$, is given by

$$p_d = \frac{1}{\sqrt{\pi d N_0}} \int_{d\sqrt{E_s}}^{\infty} e^{-\frac{x^2}{dN_0}}\, dx \tag{1.2}$$

This may be expressed in terms of the complementary error function (erfc)

$$\text{erfc}(y) = \frac{2}{\sqrt{2\pi}} \int_{y}^{\infty} e^{-\frac{x^2}{2}}\, dx \tag{1.3}$$

and

$$p\_d = \frac{1}{2} \text{erfc}\left(\sqrt{d \frac{k}{n} \frac{E\_b}{N\_0}}\right) \tag{1.4}$$
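The pairwise error probability of Eq. (1.4) is straightforward to evaluate with the standard complementary error function. A minimal sketch, assuming illustrative values of $d = 22$, a (128, 64) code and $E_b/N_0 = 3$ dB:

```python
from math import erfc, sqrt

def pairwise_error(d, n, k, ebno_db):
    """Probability, Eq. (1.4), that the matched filter of a codeword at
    Hamming distance d outputs more than that of the transmitted codeword."""
    ebno = 10 ** (ebno_db / 10)          # Eb/N0 converted from dB to a ratio
    return 0.5 * erfc(sqrt(d * (k / n) * ebno))

print(pairwise_error(22, 128, 64, 3.0))
```

As expected, the probability falls rapidly as the Hamming distance $d$ or $E_b/N_0$ increases.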

Each of the other $2^k - 2$ codewords may also cause a decoder error, but the weight distribution of the code $C_i$ is usually unknown. However, by averaging over all possible random codes, knowledge of the weight distribution of a particular code is not required. The probability of two codewords of a randomly chosen code $C_i$ differing in $d$ bit positions, $p(d|C_i)$, is given by the binomial distribution

$$p(d|C_i) = \frac{\binom{n}{d}}{2^n},\tag{1.5}$$

where $\binom{a}{b} = \frac{a!}{(a-b)!\,b!}$. A given linear code $C_i$ cannot have codewords of arbitrary weight, because the sum of a subset of codewords is also a codeword. However, for nonlinear codes, $p_d$ may be averaged over all of the codes without this constraint. Thus, we have

$$\overline{p_C} = \sum_{i=1}^{2^{n2^k}} p(d|C_i)\, p(C_i) < \frac{1}{2^{n2^k}} \sum_{d=0}^{n} \sum_{i=1}^{2^{n2^k}} \frac{\binom{n}{d}}{2^{n+1}} \text{erfc}\left(\sqrt{d \frac{k}{n} \frac{E_b}{N_0}}\right) \tag{1.6}$$

Rearranging the order of summation

$$\overline{p\_{\mathbf{C}}} < \frac{1}{2^{n2^k}} \sum\_{i=1}^{2^{n2^k}} \sum\_{d=0}^{n} \frac{\binom{n}{d}}{2^{n+1}} \text{erfc}\left(\sqrt{d \frac{k}{n} \frac{E\_b}{N\_0}}\right) \tag{1.7}$$

and

$$\overline{p\_{\mathbf{C}}} < \frac{1}{2^{n+1}} \sum\_{d=0}^{n} \binom{n}{d} \text{erfc}\left(\sqrt{d \frac{k}{n} \frac{E\_b}{N\_0}}\right). \tag{1.8}$$

Remembering that any of the $2^k - 1$ matched filters may cause a decoder error, the overall probability of decoder error averaged over all possible binary codes, $\overline{p_{\text{overall}}}$, is

$$
\overline{p\_{\text{overall}}} = 1 - \left(1 - \overline{p\_{\text{C}}}\right)^{2^k - 1} < 2^k \overline{p\_{\text{C}}} \tag{1.9}
$$

and

$$\overline{p\_{\text{overall}}} < \frac{2^k}{2^{n+1}} \sum\_{d=0}^n \binom{n}{d} \text{erfc}\left(\sqrt{d \frac{k}{n} \frac{E\_b}{N\_0}}\right). \tag{1.10}$$

An analytic solution may be obtained by observing that $\frac{1}{2}\text{erfc}(y)$ is upper bounded by $e^{-y^2}$ and therefore,

$$\overline{p\_{\text{overall}}} < \frac{2^k}{2^n} \sum\_{d=0}^n \binom{n}{d} \text{e}^{-d\frac{k}{n}\frac{E\_b}{N\_0}} \tag{1.11}$$

and as observed in [21],

$$\left(1+\mathbf{e}^{-\frac{k}{n}\frac{E\_b}{N\_0}}\right)^n = \sum\_{d=0}^n \binom{n}{d} \mathbf{e}^{-d\frac{k}{n}\frac{E\_b}{N\_0}}\tag{1.12}$$

and

$$\overline{p_C} < \frac{1}{2^n} \left( 1 + e^{-\frac{k}{n}\frac{E_b}{N_0}} \right)^n \tag{1.13}$$

$$\overline{p_{\text{overall}}} < \frac{2^k}{2^n} \left( 1 + e^{-\frac{k}{n}\frac{E_b}{N_0}} \right)^n \tag{1.14}$$

Traditionally, a cut-off rate *R*<sup>0</sup> is defined after observing that

$$\frac{2^k}{2^n} \left( 1 + e^{-\frac{k}{n}\frac{E_b}{N_0}} \right)^n = 2^k \left( \frac{1 + e^{-\frac{k}{n}\frac{E_b}{N_0}}}{2} \right)^n \tag{1.15}$$

with

$$2^{R_0} = \frac{2}{1 + e^{-\frac{k}{n}\frac{E_b}{N_0}}} \tag{1.16}$$

**Fig. 1.1** Approximate and exact Gallager bounds for (128, 2<sup>64</sup>), (256, 2<sup>128</sup>) and (512, 2<sup>256</sup>) nonlinear binary codes

then

$$\overline{p\_{\text{overall}}} < 2^k 2^{-nR\_0} = 2^{k - nR\_0} = 2^{-n(R\_0 - \frac{k}{n})} \tag{1.17}$$

This result may be interpreted as follows: provided the number of information bits of the code is less than the length of the code times the cut-off rate, the probability of decoder error will approach zero as the length of the code approaches infinity. Alternatively, provided the rate of the code, $\frac{k}{n}$, is less than the cut-off rate, $R_0$, the probability of decoder error will approach zero as the length of the code approaches infinity. The cut-off rate $R_0$, particularly in the period from the late 1950s to the 1970s, was used as a practical measure of the code rate of an achievable error-correction system [11, 20–22]. However, plotting the exact expression for the probability of decoder error, Eq. (1.10), in comparison to the cut-off rate approximation, Eq. (1.17), shows a significant difference in performance, as shown in Fig. 1.1. The codes shown are the (128, 2<sup>64</sup>), (256, 2<sup>128</sup>) and (512, 2<sup>256</sup>) code ensembles of nonlinear, random binary codes. It is recommended that the exact expression, Eq. (1.10), be evaluated unless the code in question is a long code. As a consequence, in the following sections we shall only use the exact Gallager bound.
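The gap between the exact bound and the cut-off rate approximation is easy to check numerically. A minimal sketch, assuming the (128, 2<sup>64</sup>) ensemble and an illustrative $E_b/N_0$ of 3 dB:

```python
from math import erfc, exp, sqrt, comb, log2

def gallager_exact(n, k, ebno_db):
    """Exact ensemble union bound of Eq. (1.10)."""
    ebno = 10 ** (ebno_db / 10)
    s = sum(comb(n, d) * erfc(sqrt(d * (k / n) * ebno)) for d in range(n + 1))
    return 2.0 ** (k - n - 1) * s

def cutoff_approx(n, k, ebno_db):
    """Cut-off rate approximation of Eq. (1.17), with R0 from Eq. (1.16)."""
    ebno = 10 ** (ebno_db / 10)
    r0 = 1 - log2(1 + exp(-(k / n) * ebno))
    return 2.0 ** (-n * (r0 - k / n))

print(gallager_exact(128, 64, 3.0), cutoff_approx(128, 64, 3.0))
```

Since the derivation of Eq. (1.17) upper bounds $\frac{1}{2}\text{erfc}(y)$ by $e^{-y^2}$, the exact expression is always the smaller of the two.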

Shown in Fig. 1.2 is the sphere packing lower bound, offset by the loss attributable to binary transmission, and the Gallager upper bound for the (128, 2<sup>64</sup>), (256, 2<sup>128</sup>) and (512, 2<sup>256</sup>) nonlinear binary codes. For each code, the exact Gallager upper bound given by (1.10) is shown. One reason why Gallager's bound is some way from the sphere packing lower bound, as shown in Fig. 1.2, is that the bound is based on the union bound and counts all error events as if these are independent. Except for orthogonal codes, this produces increasing inaccuracy as $\frac{E_b}{N_0}$ is reduced. Equivalently expressed, double counting is taking place since some codewords include the support of other codewords. It is shown in the next section that for linear codes the Gallager bound may be improved by considering the erasure correcting capability of codes, viz. no $(n, k)$ code can correct more than $n - k$ erasures.

**Fig. 1.2** Sphere packing and Gallager bounds for (128, 2<sup>64</sup>), (256, 2<sup>128</sup>) and (512, 2<sup>256</sup>) nonlinear binary codes

#### *1.1.1 Linear Codes with a Binomial Weight Distribution*

The weight enumerator polynomial of a code is defined as *A*(*z*) which is given by

$$A(z) = \sum\_{i=0}^{n} A\_i \ z^i \tag{1.18}$$

For many good and exceptional linear, binary codes, including algebraic and quasi-cyclic codes, the weight distributions of the codes closely approximate a binomial distribution, where


$$A(z) = \frac{1}{2^{n-k}} \sum\_{i=0}^{n} \frac{n!}{(n-i)!i!} \ z^i \tag{1.19}$$

with coefficients *Ai* given by

$$A\_i = \frac{1}{2^{n-k}} \frac{n!}{(n-i)!i!} = \frac{1}{2^{n-k}} \binom{n}{i} . \tag{1.20}$$
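As a quick sanity check of Eq. (1.20), the binomial coefficients $A_i$ of a hypothetical (128, 64) code with an exactly binomial weight distribution must sum to $2^k$, the total number of codewords. A sketch:

```python
from math import comb

n, k = 128, 64
# Binomial weight-distribution approximation, Eq. (1.20)
A = [comb(n, i) / 2 ** (n - k) for i in range(n + 1)]
print(sum(A))   # equals 2^64, up to floating-point rounding
```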

Tables of the best-known linear codes have been published from time to time [3, 10, 13, 16, 19] and a regularly updated database is maintained by Markus Grassl [5]. For a linear code, the difference between any two codewords is also a codeword, and hence the distribution of the Hamming distances between a codeword and all other codewords is the same as the weight distribution of the code. Accordingly, the overall probability of decoder error, for the same system as before using a bank of $2^k$ matched filters with each filter matched to a codeword, is upper bounded by

$$p\_{\text{overall}} < \frac{1}{2} \sum\_{d=0}^{n} A\_d \text{erfc}\left(\sqrt{d \frac{k}{n} \frac{E\_b}{N\_0}}\right) \tag{1.21}$$

For codes having a binomial weight distribution

$$p\_{\text{overall}} < \frac{1}{2} \sum\_{d=0}^{n} \frac{1}{2^{n-k}} \binom{n}{d} \text{erfc}\left(\sqrt{d \frac{k}{n} \frac{E\_b}{N\_0}}\right) \tag{1.22}$$

which becomes

$$p\_{\text{overall}} < \frac{2^k}{2^{n+1}} \sum\_{d=0}^n \binom{n}{d} \text{erfc}\left(\sqrt{d \frac{k}{n} \frac{E\_b}{N\_0}}\right). \tag{1.23}$$

It will be noticed that this equation is identical to Eq. (1.10). This leads to the somewhat surprising conclusion that the decoder error probability performance of some of the best-known, linear, binary codes is the same as the average performance of the ensemble of all randomly chosen, binary nonlinear codes having the same values for *n* and *k*. Moreover, some of the nonlinear codes must have better performance than their average, and hence some nonlinear codes must be better than the best-known linear codes.

A tighter upper bound than the Gallager bound may be obtained by considering the erasure correcting capability of the code. It is shown in Chap. 14 that for the erasure channel, given a probability of erasure, *p*, the probability of decoder error, *P*code(*p*), is bounded by


$$P_{\text{code}}(p) < \sum_{s=d_{\min}}^{n-k} \sum_{j=d_{\min}}^{s} A_j \binom{n-j}{s-j} p^s (1-p)^{n-s} + \sum_{s=n-k+1}^{n} \binom{n}{s} p^s (1-p)^{n-s}.\tag{1.24}$$

In Eq. (1.24), the first term depends upon the weight distribution of the code while the second term is independent of the code. The basic principle in the above equation is that an erasure decoder error occurs if an erasure pattern includes the support of a codeword. Since no erasure pattern containing more than $n - k$ erasures can be corrected, only codewords with weight less than or equal to $n - k$ are involved. Consequently, a much tighter bound is obtained than a bound based on the union bound, as there is less likelihood of double counting error events.

Considering the maximum likelihood decoder consisting of a bank of correlators, a decoder error occurs if one correlator has a higher output than the correlator corresponding to the correct codeword where the two codewords differ in *s* bit positions. To the decoder, it makes no difference if the decoder error event is due to erasures, from the erasure channel, or Gaussian noise from the AWGN channel; the outcome is the same. For the erasure channel, the probability of this error event due to erasures, *P*erasure(*p*) is

$$P\_{\text{erasure}}(p) = p^{\text{s}} \tag{1.25}$$

The probability of this error event due to noise, $P_{\text{noise}}\left(\frac{E_b}{N_0}\right)$, is

$$P\_{\text{noise}}\left(\frac{E\_b}{N\_0}\right) = \frac{1}{2}\text{erfc}\left(\sqrt{s\frac{k}{n}\frac{E\_b}{N\_0}}\right) \tag{1.26}$$

Equating Eq. (1.25) to Eq. (1.26) gives a relationship between the erasure probability $p$, $\frac{E_b}{N_0}$ and the Hamming distance $s$:

$$p^s = \frac{1}{2} \text{erfc}\left(\sqrt{s\frac{k}{n}\frac{E\_b}{N\_0}}\right) \tag{1.27}$$

For many codes, the erasure decoding performance is determined by a narrow range of Hamming distances, and the variation in $\frac{E_b}{N_0}$ as a function of $s$ is insignificant. This is illustrated in Fig. 1.3, which shows the variation in $\frac{E_s}{N_0}$ as a function of $s$ and $p$.

It is well known that the distance distribution for many linear, binary codes, including BCH codes, Goppa codes and self-dual codes [7, 8, 10, 14], approximates a binomial distribution. Accordingly,

$$A\_j \approx \frac{n!}{(n-j)! \, j! 2^{n-k}}.\tag{1.28}$$

**Fig. 1.3** $\frac{E_s}{N_0}$ as a function of Hamming distance *s* and erasure probability *p*

Substituting this into Eq. (1.24) produces

$$P_{\text{code}}(p) < \sum_{s=1}^{n-k} \frac{2^s - 1}{2^{n-k}} \binom{n}{s} p^s (1-p)^{n-s} + \sum_{s=n-k+1}^{n} \binom{n}{s} p^s (1-p)^{n-s} \tag{1.29}$$

With the assumption of a binomial weight distribution, an upper bound may be determined for the erasure performance of any (*n*, *k*) code, and in turn, equating Eq. (1.25) with Eq. (1.26) produces an upper bound for the AWGN channel. For example, Fig. 1.4 shows an upper bound of the erasure decoding performance of a (128, 64) code with a binomial weight distribution.
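The erasure bound is simple to evaluate directly. A minimal sketch of Eq. (1.29), assuming a (128, 64) code with a binomial weight distribution and arbitrary illustrative erasure probabilities; the tail term is written with the coefficient $\binom{n}{s}$ so that it equals the probability of more than $n - k$ erasures occurring:

```python
from math import comb

def p_code_erasure(n, k, p):
    """Upper bound, Eq. (1.29), on erasure decoder error probability for an
    (n, k) code with an assumed binomial weight distribution."""
    first = sum((2 ** s - 1) / 2 ** (n - k) * comb(n, s)
                * p ** s * (1 - p) ** (n - s) for s in range(1, n - k + 1))
    second = sum(comb(n, s) * p ** s * (1 - p) ** (n - s)
                 for s in range(n - k + 1, n + 1))
    return first + second

print(p_code_erasure(128, 64, 0.2))
```

The bound increases monotonically with the erasure probability $p$, as expected.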

Using Eq. (1.27), the decoding performance may be expressed in terms of $\frac{E_b}{N_0}$, and Fig. 1.5 shows the upper bound of the decoding performance of the same code against Gaussian noise, as a function of $\frac{E_b}{N_0}$.

The comparison of the sphere packing bound and the Gallager bounds is shown in Fig. 1.6. Also shown in Fig. 1.6 is the performance of the BCH (128, 64, 22) code evaluated using the modified Dorsch decoder. It can be seen from Fig. 1.6 that the erasure-based upper bound is very close to the sphere packing lower bound and tighter than the Gallager bound.

Figure 1.7 gives the bounds for the (512, 256) and (256, 128) codes. It will be noticed that the gap between the sphere packing bound and the erasure-based upper bound increases with code length, but is tighter than the Gallager bound.

**Fig. 1.4** Erasure decoding performance of a (128, 64) code with a binomial weight distribution

**Fig. 1.5** Decoding performance of a (128, 64) code with a binomial weight distribution for Gaussian noise

**Fig. 1.6** Comparison of sphere packing and Gallager bounds to the upper bound based on erasure performance for the (128, 64) code with a binomial weight distribution

**Fig. 1.7** Comparison of sphere packing and Gallager bounds to the upper bound based on erasure performance for (256, 128) and (512, 256) codes with a binomial weight distribution

#### *1.1.2 Covering Radius of Codes*

The covering radius of a code, $c_r$, if it is known, together with the weight spectrum of the low-weight codewords, may be used to tighten the union bound upper bound on decoder performance given by Eq. (1.23). The covering radius of a code is defined as the minimum radius which, when placed around each codeword, includes all possible $q^n$ vectors. Equivalently, the covering radius is the maximum number of hard decision errors that are correctable by the code. For a perfect code, such as the Hamming codes, the covering radius is equal to $\frac{d_{\min}-1}{2}$. For the $[2^m - 1, 2^m - m - 1, 3]$ Hamming codes, the covering radius is equal to 1, and for the (23, 12, 7) Golay code the covering radius is equal to 3. As a corollary, for any received vector in Euclidean space, there is always a codeword within a Euclidean distance of $c_r + 0.5$. It follows that the summation in Eq. (1.23) may be limited to codewords of weight at most $2c_r + 1$ to produce

$$p\_{\text{overall}} < \frac{2^k}{2^{n+1}} \sum\_{d=0}^{2c\_r+1} \binom{n}{d} \text{erfc}\left(\sqrt{d \frac{k}{n} \frac{E\_b}{N\_0}}\right). \tag{1.30}$$
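For short codes the covering radius can be found by exhaustive search. A sketch using the [7, 4, 3] Hamming code, whose covering radius is 1 as stated above; the generator matrix below is one common systematic choice, and any equivalent choice gives the same result:

```python
from itertools import product

# A systematic generator matrix of the [7, 4, 3] Hamming code
G = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]

# Enumerate all 2^4 codewords as message * G over GF(2)
codewords = [tuple(sum(m[i] * G[i][j] for i in range(4)) % 2 for j in range(7))
             for m in product([0, 1], repeat=4)]

def dist(a, b):
    """Hamming distance between two binary tuples."""
    return sum(x != y for x, y in zip(a, b))

# Covering radius: the largest distance from any vector to its nearest codeword
cr = max(min(dist(v, c) for c in codewords) for v in product([0, 1], repeat=7))
print(cr)   # 1, since the Hamming code is perfect
```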

#### *1.1.3 Usefulness of Bounds*

The usefulness of bounds may be realised from Fig. 1.8, which shows the performance achieved by optimised codes and decoders for three (512, 256) codes: a turbo code, an LDPC code and a concatenated code.

#### **1.2 Bounds on the Construction of Error-Correcting Codes**

A code (linear or nonlinear), $C$, defined in a finite field of size $q$ can be described by its length $n$, number of codewords<sup>1</sup> $M$ and minimum distance $d$. We use $(n, M, d)_q$ to denote these four important parameters of a code. Given any number of codes defined in a field of size $q$ with the same length $n$ and distance $d$, the code with the maximum number of codewords $M$ is the most desirable. Equivalently, one may choose to fix $n$, $M$ and $q$ and maximise $d$, or fix $M$, $d$ and $q$ and maximise $n$. As a result, it is of interest in coding theory to determine the maximum number of codewords possible for any code defined in a field of size $q$, with minimum distance $d$ and length $n$. This number is denoted by $A_q(n, d)$. Bounds on $A_q(n, d)$ are indicators of the maximum performance achievable by any code with parameters $(n, M, d)_q$. As a result, these bounds are especially useful when one constructs good error-correcting codes. The tables in [5] contain the best-known upper and lower bounds on $A_q(n, d)$ for linear codes. The tables in [9] contain bounds on $A_2(n, d)$ for nonlinear binary codes.

<sup>1</sup>Where the code dimension $k = \log_q M$.

**Fig. 1.8** Comparison of sphere packing, Gallager and erasure-based bounds to the performance realised for a (512, 256, 18) turbo code, (512, 256, 14) LDPC code and (512, 256, 32) concatenated code

Lower bounds on $A_q(n, d)$ tend to be code specific; however, there are several generic upper bounds. As an example, consider the best-known upper and lower bounds on $A_2(128, d)$ obtained from the tables in [5]. These are shown in Fig. 1.9 for the range $1 \le d \le 128$. Optimal codes of length $n = 128$ are codes whose lower and upper bounds on $A_2(128, d)$ coincide. The two curves coincide when $k$ is small and $d$ is large, or vice versa. The gap between the upper and lower bounds that exists for other values of $k$ and $d$ suggests that one can construct good codes with a larger number of codewords and improve the lower bounds. An additional observation is that extended BCH codes count among the known codes with the largest number of codewords.

It is often useful to see the performance of codes as their code lengths become arbitrarily large. We define the information rate

$$\alpha\_q(\delta) = \lim\_{n \to \infty} \frac{\log\_q(A\_q(n, \delta n))}{n},\tag{1.31}$$

where $\delta = \frac{d}{n}$ is called the relative distance. Since the dimension of the code is defined as $k = \log_q(A_q(n, \delta n))$, a bound on the information rate $\alpha_q(\delta)$ is a bound on $\frac{k}{n}$ as $n \to \infty$.

**Fig. 1.9** Upper and lower bounds on $A_2(128, d)$

#### *1.2.1 Upper Bounds*

#### **1.2.1.1 Sphere Packing (Hamming) Bound**

Let $V_q(n, t)$ represent the number of vectors in a sphere of radius $t$ centred on a codeword; then,

$$V\_q(n,t) = \sum\_{i=0}^{t} \binom{n}{i} (q-1)^i. \tag{1.32}$$

**Theorem 1.1** (Sphere Packing Bound) *The maximum number of codewords Aq* (*n*, *d*) *is upper bounded by,*

$$A_q(n,d) \le \frac{q^n}{\sum_{i=0}^{t} \binom{n}{i} (q-1)^i}$$

*Proof* A code $C$ is a subset of the vector space $\mathrm{GF}(q)^n$. Place a sphere of radius $t = \lfloor\frac{d-1}{2}\rfloor$ around each codeword of $C$. Since codewords are spaced at least $d$ places apart, no vector of $\mathrm{GF}(q)^n$ lies in more than one sphere; in other words, the spheres are non-overlapping, and each sphere can represent an individual codeword for counting purposes. The Hamming bound counts the number of such non-overlapping spheres that fit in the vector space $\mathrm{GF}(q)^n$.
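The bound of Theorem 1.1 is easy to compute exactly with integer arithmetic. A sketch, checking that the binary (23, 12, 7) Golay code mentioned earlier meets the bound with equality:

```python
from math import comb

def hamming_bound(q, n, d):
    """Sphere packing (Hamming) upper bound on A_q(n, d), Theorem 1.1."""
    t = (d - 1) // 2
    vol = sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))  # V_q(n, t)
    return q ** n // vol

print(hamming_bound(2, 23, 7))   # 4096 = 2^12, so the Golay code is perfect
```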

Codes that meet this bound are called *perfect* codes. In order to state the asymptotic sphere packing bound, we first define the $q$-ary entropy function, $H_q(x)$, for the values $0 \le x \le \theta$, where $\theta = 1 - q^{-1}$,

$$H_q(x) = \begin{cases} 0 & \text{if } x = 0 \\ x \log_q(q - 1) - x \log_q x - (1 - x) \log_q(1 - x) & \text{if } 0 < x \le \theta \end{cases} \tag{1.33}$$

**Theorem 1.2** (Asymptotic Sphere Packing Bound) *The information rate of a code* α*<sup>q</sup>* (δ) *is upper bounded by,*

$$\alpha\_q(\delta) \le 1 - H\_q\left(\frac{\delta}{2}\right)$$

*for the range* $0 < \delta \le 1 - q^{-1}$*.*

#### **1.2.1.2 Plotkin Bound**

**Theorem 1.3** (Plotkin Bound) *Provided* $d > \theta n$*, where* $\theta = 1 - q^{-1}$*, then,*

$$A\_q(n,d) \le \left\lfloor \frac{d}{d-\theta n} \right\rfloor$$

*Proof* Let $S = \sum d(\mathbf{x}, \mathbf{y})$, summed over all ordered pairs of distinct codewords $\mathbf{x}, \mathbf{y} \in C$, where $d(\mathbf{x}, \mathbf{y})$ denotes the Hamming distance between codewords $\mathbf{x}$ and $\mathbf{y}$. Assume that all the codewords of $C$ are arranged as the rows of an $M \times n$ matrix $D$. Since $d(\mathbf{x}, \mathbf{y}) \ge d$,

$$S \ge \frac{M!}{(M-2)!}d = M(M-1)d.\tag{1.34}$$

Let $n_{i,\alpha}$ be the number of times an element $\alpha$ of the defining field $\mathrm{GF}(q)$ occurs in the $i$th column of the matrix $D$. Then $\sum_{\alpha \in \mathrm{GF}(q)} n_{i,\alpha} = M$. For each $n_{i,\alpha}$ there are $M - n_{i,\alpha}$ entries of column $i$ of the matrix $D$ that contain elements other than $\alpha$; each such pair of entries contributes a Hamming distance of 1, and there are $n$ columns. Thus,

$$S = \sum_{i=1}^{n} \sum_{\alpha \in \mathrm{GF}(q)} n_{i,\alpha}(M - n_{i,\alpha}) = nM^2 - \sum_{i=1}^{n} \sum_{\alpha \in \mathrm{GF}(q)} n_{i,\alpha}^2. \tag{1.35}$$

From the Cauchy–Schwarz inequality,

$$\left(\sum_{\alpha\in\mathrm{GF}(q)} n_{i,\alpha}\right)^2 \le q \sum_{\alpha\in\mathrm{GF}(q)} n_{i,\alpha}^2. \tag{1.36}$$

Equation (1.35) becomes,

$$S \le nM^2 - \sum_{i=1}^n q^{-1} \left( \sum_{\alpha \in \mathrm{GF}(q)} n_{i,\alpha} \right)^2 \tag{1.37}$$

Let $\theta = 1 - q^{-1}$; then

$$\begin{split} S &\le nM^2 - \sum_{i=1}^n q^{-1} \left( \sum_{\alpha \in \mathrm{GF}(q)} n_{i,\alpha} \right)^2 \\ &\le nM^2 - q^{-1}nM^2 \\ &\le n\theta M^2. \end{split} \tag{1.38}$$

Thus from (1.34) and (1.38) we have,

$$M(M-1)d \le S \le n\theta M^2 \tag{1.39}$$

$$M \le \left\lfloor \frac{d}{d - \theta n} \right\rfloor \tag{1.40}$$

and clearly *d* > θ*n*.
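The bound of Eq. (1.40) is a one-line computation. A sketch for the binary case ($q = 2$, so $\theta = \frac{1}{2}$), with illustrative parameters:

```python
from math import floor

def plotkin_bound(q, n, d):
    """Plotkin upper bound on A_q(n, d), Theorem 1.3; requires d > theta*n."""
    theta = 1 - 1 / q
    if d <= theta * n:
        raise ValueError("Plotkin bound requires d > theta * n")
    return floor(d / (d - theta * n))

print(plotkin_bound(2, 16, 10))   # A_2(16, 10) <= 5
```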

**Corollary 1.1** (Asymptotic Plotkin Bound) *The asymptotic Plotkin bound is given by,*

$$\begin{aligned} \alpha\_q(\delta) &= 0 \quad &\text{if} \quad \theta \le \delta \le 1\\ \alpha\_q(\delta) &\le 1 - \frac{\delta}{\theta} \quad &\text{if} \quad 0 \le \delta \le \theta. \end{aligned}$$

#### **1.2.1.3 Singleton Bound**

**Theorem 1.4** (Singleton Bound) *The maximum number of codewords Aq* (*n*, *d*) *is upper bounded by,*

$$A\_q(n,d) \le q^{n-d+1}.$$

Codes that meet this bound with equality, i.e. $d = n - k + 1$, are called maximum distance separable (MDS) codes. The asymptotic Singleton bound is given in Theorem 1.5.

**Theorem 1.5** (Asymptotic Singleton Bound) *The information rate* α*<sup>q</sup>* (δ) *is upper bounded by,*

$$
\alpha_q(\delta) \le 1 - \delta.
$$

The asymptotic Singleton bound does not depend on the field size *q* and is a straight line with a negative slope in a plot of α*<sup>q</sup>* (δ) against δ for every field.

#### **1.2.1.4 Elias Bound**

Another upper bound is the Elias bound [17]. This bound was discovered by P. Elias but was never published by him. We only state the bound here, as the proof is beyond the scope of this text; for a complete treatment see [6, 10].

**Theorem 1.6** (Elias Bound) *A code C of length n with codewords having weight at most w, where* $w < \theta n$ *and* $\theta = 1 - q^{-1}$*, has,*

$$d \le \frac{Mw}{M-1} \left(2 - \frac{w}{\theta n}\right)$$

**Theorem 1.7** (Asymptotic Elias Bound) *The information rate* α*<sup>q</sup>* (δ) *is upper bounded by,*

$$\alpha\_q(\delta) \le 1 - H\_q(\theta - \sqrt{\theta(\theta - \delta)})$$

*provided* $0 < \delta < \theta$*, where* $\theta = 1 - q^{-1}$*.*

#### **1.2.1.5 MRRW Bounds**

The McEliece–Rodemich–Rumsey–Welch (MRRW) bounds are asymptotic bounds obtained using linear programming.

**Theorem 1.8** (Asymptotic MRRW Bound I) *Provided* $0 < \delta < \theta$*, where* $\theta = 1 - q^{-1}$*, then,*

$$\alpha\_q(\delta) \le H\_q\left(\frac{1}{q}(q-1-(q-2)\delta -2\sqrt{\delta(1-\delta)(q-1)})\right)$$

The second MRRW bound applies to the case when *q* = 2.

**Theorem 1.9** (MRRW Bound II) *Provided* $0 < \delta < \frac{1}{2}$ *and* $q = 2$*, then,*

$$\alpha\_2(\delta) \le \min\_{0 \le u \le 1-2\delta} \{ 1 + g(u^2) - g(u^2 + 2\delta u + 2\delta) \}$$

*where*

$$g(x) = H\_2\left(\frac{1 - \sqrt{1 - x}}{2}\right).$$

The MRRW bounds are the best-known upper bounds on the information rate for the binary case. The MRRW-II bound is better than the MRRW-I bound when $\delta$ is small and $q = 2$. An in-depth treatment and proofs of the bounds can be found in [12].

#### *1.2.2 Lower Bounds*

#### **1.2.2.1 Gilbert–Varshamov Bound**

**Theorem 1.10** (Gilbert–Varshamov Bound) *The maximum number of codewords Aq* (*n*, *d*) *is lower bounded by,*

$$A\_q(n,d) \ge \frac{q^n}{V\_q(n,d-1)} = \frac{q^n}{\sum\_{i=0}^{d-1} \binom{n}{i} (q-1)^i}.$$

*Proof* We know that $V_q(n, d-1)$ represents the volume of a sphere of radius $d - 1$ centred on a codeword of $C$. Suppose $C$ has $A_q(n, d)$ codewords, the maximum possible. Then every vector $\mathbf{v} \in \mathbb{F}_q^n$ must lie within a sphere of radius $d - 1$ centred at some codeword of $C$; otherwise $\mathbf{v}$ could be added to $C$ without reducing the minimum distance. As such,

$$\left| \bigcup\_{i=1}^{A\_q(n,d)} \mathcal{S}\_i \right| = |\mathbb{F}\_q^n|,$$

where *S<sub>i</sub>* is the set of all vectors in the sphere of radius *d* − 1 centred on the *i*th codeword of *C*. The spheres *S<sub>i</sub>* are not necessarily disjoint, so the sum of their volumes is at least the size of their union, giving

$$A_q(n,d)V_q(n,d-1) \ge |\mathbb{F}_q^n| = q^n.$$
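The Gilbert–Varshamov lower bound is straightforward to evaluate numerically. A minimal Python sketch (the function name is ours):

```python
import math

def gv_lower_bound(n, d, q=2):
    """Gilbert-Varshamov lower bound on A_q(n, d): q^n divided by the
    volume V_q(n, d-1) of a Hamming ball of radius d - 1."""
    volume = sum(math.comb(n, i) * (q - 1) ** i for i in range(d))
    # A_q(n, d) is an integer, so taking the floor still gives a valid lower bound.
    return q ** n // volume
```

For example, `gv_lower_bound(7, 3)` guarantees the existence of a binary code of length 7 and minimum distance 3 with at least 4 codewords; the (7, 4, 3) Hamming code, with 16 codewords, comfortably exceeds this.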

**Theorem 1.11** *The information rate of a code is lower bounded by,*

$$\alpha\_q(\delta) \ge 1 - H\_q(\delta)$$

*for* 0 ≤ δ ≤ θ*, where* θ = 1 − *q*<sup>−1</sup>*.*

Figures 1.10 and 1.11 show the asymptotic upper and lower bounds for the cases where *q* = 2 and *q* = 32, respectively. Figure 1.10 shows that the MRRW bounds are the best-known upper bounds when *q* = 2, while Fig. 1.11 shows that the Plotkin bound is the best upper bound for the case when *q* = 32.

**Fig. 1.10** α<sub>q</sub>(δ) against δ for *q* = 2

**Fig. 1.11** α<sub>q</sub>(δ) against δ for *q* = 32


#### *1.2.3 Lower Bounds from Code Tables*

Tables of best-known codes are maintained such that if a code over a field of size *q* is constructed with an evaluated and verifiable minimum Hamming distance *d* exceeding that of the previously best-known code with the same length *n* and dimension *k*, the new code establishes a lower bound on *A<sub>q</sub>*(*n*, *d*). The first catalogue of best-known codes was presented by Calabi and Myrvaagnes [2], containing binary codes of length *n* and dimension *k* in the range 1 ≤ *k* ≤ *n* ≤ 24. Brouwer and Verhoeff [1] subsequently presented a comprehensive update to the tables, which included codes over finite fields of size up to 9 with extended ranges for *k* and *n*.

At present, Grassl [5] maintains a significantly updated version of the tables in [1]; the tables now contain codes with *k* and *n* in the ranges given in Table 1.1. Finally, Schmid and Schürer [15] provide an online database for optimal parameters of (*t*, *m*, *s*)-nets, (*t*, *s*)-sequences, orthogonal arrays, linear codes and ordered orthogonal arrays. These are relatively new tables and give the best-known codes over finite fields of size up to 256. The search for codes whose dimension exceeds the best-known lower bounds on *A<sub>q</sub>*(*n*, *d*) is an active area of research, with the research community constantly finding improvements.

#### **1.3 Summary**

In this chapter, we discussed the theoretical performance of binary codes for the additive white Gaussian noise (AWGN) channel. In particular, the usefulness of Gallager's coding theorem for binary codes was explored. By assuming a binomial weight distribution for linear codes, it was shown that the decoder error probability performance of some of the best known linear binary codes is the same as the average performance of the ensemble of all randomly chosen binary nonlinear codes having the same length and dimension. Assuming a binomial weight distribution, an upper bound was determined for the erasure performance of any code, and it was shown that this can be translated into an upper bound on code performance in the AWGN channel. Different theoretical bounds on the construction of error-correction codes were discussed. For the purpose of constructing good error-correcting codes, theoretical upper bounds provide fundamental limits beyond which no improvement is possible.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 2 Soft and Hard Decision Decoding Performance**

## **2.1 Introduction**

This chapter is concerned with the performance of binary codes under maximum likelihood soft decision decoding and maximum likelihood hard decision decoding. Maximum likelihood decoding gives the best performance possible for a code and is therefore used to assess the quality of the code. In practice, maximum likelihood decoding of codes is computationally difficult, and as such, theoretical bounds on the performance of codes are used instead. These bounds take the form of lower and upper bounds, and the expected performance of the code lies within the region they enclose. For hard decision decoding, lower and upper bounds on maximum likelihood decoding are computed using information on the coset leader weight distribution. For maximum likelihood soft decision decoding, the bounds are computed using the weight distribution of the codes. The union bound is a simple and well-known bound for the performance of codes under maximum likelihood soft decision decoding, and it can be expressed as both an upper and a lower bound. Using these bounds, we see that as the SNR per bit becomes large, the performance of the codes can be completely determined by the lower bound. However, this is not the case with the bounds on maximum likelihood hard decision decoding of codes. In general, soft decision decoding has better performance than hard decision decoding, and being able to estimate the performance of codes under soft decision decoding is attractive. Computation of the union bound requires knowledge of the weight distribution of the code. In Sect. 2.3.1, we use a binomial approximation for the weight distribution of codes for which the actual computation of the weight distribution is prohibitive. As a result, it is possible to calculate, within an acceptable degree of error, the region in which the performance of codes can be completely predicted.

#### **2.2 Hard Decision Performance**

#### *2.2.1 Complete and Bounded Distance Decoding*

Hard decision decoding is concerned with decoding of the received sequence in Hamming space. Typically, the real-valued received sequence is quantised using a threshold to a binary sequence. A bounded distance decoder is guaranteed to correct all patterns of *t* or fewer errors, where *t* is called the packing radius and is given by:

$$t = \left\lfloor \frac{d-1}{2} \right\rfloor$$

and *d* is the minimum Hamming distance of the code. Within a sphere of radius *t* centred on a codeword in Hamming space there is no other codeword, and any received sequence in this sphere is closest to that codeword. Beyond the packing radius, some error patterns may be corrected. A complete decoder exhaustively compares all codewords to the received sequence and selects the codeword with minimum Hamming distance. A complete decoder is also called a minimum distance decoder or maximum likelihood decoder. Thus, a complete decoder corrects some patterns of errors beyond the packing radius. The general decoding problem solved by a complete decoder is known to be NP-complete [3]. Complete decoding can be accomplished using a standard array. In order to discuss standard array decoding, we first need to define cosets and coset leaders.

**Definition 2.1** A coset of a code *C* is a set containing all the codewords of *C* corrupted by a single sequence **a** ∈ F<sub>q</sub><sup>n</sup> \ (*C* ∪ {**0**}).

A coset of a binary (*n*, *k*) code contains 2<sup>*k*</sup> sequences, and there are 2<sup>*n*−*k*</sup> possible cosets. Any sequence of minimum Hamming weight in a coset can be chosen as a coset leader. In order to use a standard array, the coset leaders of all the cosets of a code must be known. We illustrate complete decoding with an example, using a (7, 3) dual Hamming code with the following generator matrix

$$G = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 & 0 & 1 \end{bmatrix}$$

This code has codewords

$$C = \begin{Bmatrix} 0&0&0&0&0&0&0 \\ 1&0&0&0&1&1&1 \\ 0&1&0&1&0&1&1 \\ 0&0&1&1&1&0&1 \\ 1&1&0&1&1&0&0 \\ 0&1&1&0&1&1&0 \\ 1&0&1&1&0&1&0 \\ 1&1&1&0&0&0&1 \end{Bmatrix}$$



**Fig. 2.1** Standard array for the (7, 3, 4) binary code

Complete decoding can be accomplished using standard array decoding. The example code is decoded as follows. The top row of the array in Fig. 2.1, in bold, contains the codewords of the (7, 3, 4) code.<sup>1</sup> Subsequent rows contain all the other cosets of the code, with the array arranged so that the coset leaders are in the first column. The decoder finds the received sequence on a row in the array and then subtracts the coset leader corresponding to that row from it to obtain a decoded sequence. The standard array is partitioned based on the weight of the coset leaders. Received sequences on rows with coset leaders of weight less than or equal to *t* = ⌊(*d* − 1)/2⌋ = 1 are all corrected. Some received sequences on rows with coset leaders of weight greater than *t* are also corrected. Examining the standard array, it can be seen that the code can correct all single error sequences, some two error sequences and one three error sequence. The coset leader weight distribution C<sub>*i*</sub> is

$$\begin{aligned} \mathbb{C}\_0 &= 1 \\ \mathbb{C}\_1 &= 7 \\ \mathbb{C}\_2 &= 7 \\ \mathbb{C}\_3 &= 1 \end{aligned}$$

The covering radius of the code is the weight of the largest coset leader (in this example it is 3).

<sup>1</sup>It is worth noting that a code itself can be considered as a coset with the sequence **a** an all zero sequence.
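The coset leader weight distribution of the example (7, 3, 4) code can be verified by direct enumeration rather than by building the full standard array. A short Python sketch (variable names are ours) that groups all 2<sup>7</sup> vectors into cosets and records the minimum weight in each:

```python
from itertools import product

# Rows of the generator matrix of the example (7, 3, 4) code, as 7-bit integers.
G = [0b1000111, 0b0101011, 0b0011101]

# All 2^3 codewords, formed from every combination of the generator rows.
codewords = set()
for m in product([0, 1], repeat=3):
    c = 0
    for bit, row in zip(m, G):
        if bit:
            c ^= row
    codewords.add(c)

# Group all 2^7 vectors into cosets: v and w share a coset iff v XOR w is a codeword.
seen = set()
leader_weights = []
for v in range(2 ** 7):
    if v in seen:
        continue
    coset = {v ^ c for c in codewords}
    seen.update(coset)
    leader_weights.append(min(bin(x).count("1") for x in coset))

# Coset leader weight distribution C_i and the covering radius.
distribution = [leader_weights.count(w) for w in range(max(leader_weights) + 1)]
covering_radius = max(leader_weights)
```

Running this reproduces the distribution C<sub>0</sub> = 1, C<sub>1</sub> = 7, C<sub>2</sub> = 7, C<sub>3</sub> = 1 stated above, and a covering radius of 3.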

# *2.2.2 The Performance of Codes on the Binary Symmetric Channel*

Consider a real-valued sequence received from a transmission through an AWGN channel. If a demodulator makes hard decisions at the receiver, the channel may be modelled as a binary symmetric channel (BSC). Assuming the probability of bit error for the BSC is *p*, the probability of decoding error with a bounded distance decoder is given by,

$$P\_{\text{BDD}}(e) = 1 - \sum\_{i=0}^{t} \mathbb{C}\_i p^i (1 - p)^{n - i} \tag{2.1}$$

where C<sub>*i*</sub> is the number of coset leaders of weight *i*. C<sub>*i*</sub> is known for 0 ≤ *i* ≤ *t* and is given by,

$$\mathbb{C}\_{i} = \binom{n}{i} \quad 0 \le i \le t.$$

However, the C<sub>*i*</sub> for *i* > *t* need to be computed for individual codes. The probability of error after full decoding is

$$P\_{\text{Full}}(e) = 1 - \sum\_{i=0}^{n} \mathbb{C}\_i p^i (1-p)^{n-i}. \tag{2.2}$$

Figure 2.2 shows the performance of the bounded distance decoder and the full decoder for different codes. The bounds are computed using (2.1) and (2.2). As expected, there is significant coding gain between unencoded and coded transmission (bounded distance and full decoding) for all the cases. There is a small coding gain between bounded distance and full decoders. This coding gain depends on the coset leader weight distribution C<sub>*i*</sub>, *i* > *t*, of the individual codes. The balance between complexity and performance for full and bounded distance decoders<sup>2</sup> ensures that the latter are preferred in practice. Observe in Fig. 2.2 that the complete decoder consistently outperforms the bounded distance decoder as the probability of error decreases and E<sub>b</sub>/N<sub>0</sub> increases. We will see in Sect. 2.3 that a similar setup using soft decision decoding in Euclidean space produces different results.
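Equations (2.1) and (2.2) are easy to evaluate once the coset leader weight distribution is known. A minimal Python sketch (function names are ours) for the (7, 3, 4) example above, with C = [1, 7, 7, 1] and *t* = 1:

```python
import math

def p_bdd(n, t, p):
    """Bounded distance decoder error probability, Eq. (2.1),
    using C_i = binom(n, i) for 0 <= i <= t."""
    return 1 - sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
                   for i in range(t + 1))

def p_full(coset_leader_weights, n, p):
    """Complete (full) decoder error probability, Eq. (2.2),
    given the full coset leader weight distribution C_0, C_1, ..."""
    return 1 - sum(C_i * p**i * (1 - p)**(n - i)
                   for i, C_i in enumerate(coset_leader_weights))

# The (7, 3, 4) example: coset leader weight distribution and packing radius.
C = [1, 7, 7, 1]
```

For any BSC crossover probability 0 < *p* < 1/2, the full decoder error probability is strictly smaller than the bounded distance figure, since the extra cosets beyond weight *t* contribute additional correctable error patterns.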

#### **2.2.2.1 Bounds on Decoding on the BSC Channel**

Suppose *s* is the largest index for which C<sub>*s*</sub> is non-zero; then *s* is the covering radius of the code. If the covering radius *s* of a code is known but the C<sub>*i*</sub>, *i* > *t*, are not, then the probability of error after decoding can be bounded by

<sup>2</sup>Bounded distance decoders usually have polynomial complexity, e.g. the Berlekamp–Massey decoder for BCH codes has complexity *O*(*t*<sup>2</sup>) [1].

**Fig. 2.2** BCH code BDD and full decoder performance, frame error rate (FER) against E<sub>b</sub>/N<sub>0</sub>

$$P_e \ge 1 - \left[ \sum_{i=0}^{t} \binom{n}{i} p^i (1-p)^{n-i} + p^s (1-p)^{n-s} \right] \tag{2.3}$$

$$P_e \le 1 - \left[ \sum_{i=0}^{t} \binom{n}{i} p^i (1-p)^{n-i} + \mathbb{W}_s p^s (1-p)^{n-s} \right] \tag{2.4}$$

assuming the code can correct *t* errors and

$$\mathbb{W}_s = 2^{n-k} - \sum_{i=0}^{t} \binom{n}{i}.$$

The lower bound assumes that there is a single coset leader of weight *s*, hence the term *p*<sup>*s*</sup>(1 − *p*)<sup>*n*−*s*</sup>, while the upper bound assumes that all the coset leaders of weight greater than *t* have weight equal to the covering radius *s*. For the lower bound to hold, W<sub>*s*</sub> ≥ 1. The lower bound can be further tightened by assuming that the remaining W<sub>*s*</sub> − 1 cosets have leaders of weight *t* + 1, *t* + 2, ... until they are all accounted for.<sup>3</sup>

#### **2.3 Soft Decision Performance**

The union bound for the probability of sequence error using maximum likelihood soft decoding performance on binary codes with BPSK modulation in the AWGN channel is given by [2],

$$P_s \le \frac{1}{2} \sum_{j=1}^n A_j \,\text{erfc}\left(\sqrt{\frac{E_b}{N_0}Rj}\right) \tag{2.5}$$

where *R* is the code rate, *A<sub>j</sub>* is the number of codewords of weight *j* and E<sub>b</sub>/N<sub>0</sub> is the SNR per bit. The union bound is obtained by upper bounding the probability of the union of the error events, in which the received sequence is closer in Euclidean distance to some codeword at distance *j* from the transmitted codeword, by the sum of their individual probabilities. A drawback to the exact computation of the union bound is that the weight distribution *A<sub>j</sub>*, 0 ≤ *j* ≤ *n*, of the code is required. Except for a small number of cases, the complete weight distribution of many codes is not known due to complexity limitations. Since *A<sub>j</sub>* = 0 for 1 ≤ *j* < *d*, where *d* is the minimum distance of the code, we can express (2.5) as,

$$P_s \le \frac{1}{2} \sum_{j=d}^n A_j \,\text{erfc}\left(\sqrt{\frac{E_b}{N_0}Rj}\right) \tag{2.6}$$

$$\le \frac{1}{2} A_d \,\text{erfc}\left(\sqrt{\frac{E_b}{N_0}Rd}\right) + \frac{1}{2} \sum_{j=d+1}^{n} A_j \,\text{erfc}\left(\sqrt{\frac{E_b}{N_0}Rj}\right) \tag{2.7}$$

A lower bound on the probability of error can be obtained if it is assumed that error events occur only when the received sequence is closer in euclidean distance to codewords at a distance *d* from the correct codeword.

$$P_s \ge \frac{1}{2} A_d \,\text{erfc}\left(\sqrt{\frac{E_b}{N_0}Rd}\right) \tag{2.8}$$

<sup>3</sup>This can be viewed as the code only has one term at the covering radius, and all other terms are at *t* + 1.


where

$$\frac{1}{2} \sum_{j=d+1}^{n} A_j \,\text{erfc}\left(\sqrt{\frac{E_b}{N_0}Rj}\right) = 0. \tag{2.9}$$

As such,

$$\frac{1}{2} A_d \,\text{erfc}\left(\sqrt{\frac{E_b}{N_0}Rd}\right) \le P_s \le \frac{1}{2}\sum_{j=d}^{n} A_j \,\text{erfc}\left(\sqrt{\frac{E_b}{N_0}Rj}\right) \tag{2.10}$$

Therefore, the practical soft decision performance of a binary code lies between the upper and lower union bounds. It is instructive to observe the union bound performance for actual codes, using their computed weight distributions, as the SNR per bit E<sub>b</sub>/N<sub>0</sub> increases. By allowing E<sub>b</sub>/N<sub>0</sub> to become large (and *P<sub>s</sub>* to decrease), computations for several codes suggest that at a certain *intersection* value of E<sub>b</sub>/N<sub>0</sub> the upper bound equals the lower bound. Consider Figs. 2.3, 2.4 and 2.5, which show the frame error rate against the SNR per bit for three types of codes. The upper bounds in the figures are obtained using the complete weight distribution of the codes with Eq. (2.5). The lower bounds are obtained using only the number of codewords of minimum weight with Eq. (2.8). It can be observed that as E<sub>b</sub>/N<sub>0</sub> becomes large, the upper bound meets and equals the lower bound. The significance of this observation is that for E<sub>b</sub>/N<sub>0</sub> values above the point where the two bounds intersect, the performance of the codes under soft decision decoding is completely determined by the lower bound (or, equivalently, the upper bound). In this region where the bounds agree, when errors occur they do so because the received sequence is closer to codewords a distance *d* away from the correct codeword. The actual performance of the codes before this region lies somewhere between the upper and lower bounds. As we have seen earlier, the two bounds agree when the sum in (2.9) approaches 0. It may be useful to consider an upper bound on the complementary error function (erfc),

$$\text{erfc}(x) < e^{-x^2}$$

in which case the condition becomes

$$\frac{1}{2} \sum_{j=d+1}^{n} A_j \, e^{-\frac{E_b}{N_0}Rj} \approx 0.\tag{2.11}$$

Clearly, the sum approximates to zero if each term in the sum also approximates to zero. It is safe to assume that the term *A<sub>j</sub>* erfc(√((E<sub>b</sub>/N<sub>0</sub>)*Rj*)) decreases as *j* increases, since the erfc term decays exponentially with *j* while *A<sub>j</sub>* grows binomially (in most cases). The size of the gap between the lower and upper bounds is also

**Fig. 2.3** Extended BCH code lower and upper union bound performance, frame error rate (FER) against E<sub>b</sub>/N<sub>0</sub>

determined by these terms. Each term *A<sub>j</sub>* e<sup>−(E<sub>b</sub>/N<sub>0</sub>)*Rj*</sup> becomes small if one or both of the following conditions are met: the multiplicity *A<sub>j</sub>* is small (or zero), or the product (E<sub>b</sub>/N<sub>0</sub>)*Rj* is large.

Observing Figs. 2.3, 2.4 and 2.5, it can be seen that at small values of E<sub>b</sub>/N<sub>0</sub>, low rate codes, for which *R* = *k*/*n* is small, have some *A<sub>j</sub>* = 0 for *j* > *d*, and as such the gaps

**Fig. 2.4** BCH code lower and upper union bound performance, frame error rate (FER) against E<sub>b</sub>/N<sub>0</sub>

between the upper and lower bounds are small. As an example consider the low rate (127, 22, 47) BCH code in Fig. 2.4a which has,

$$A\_j = 0 \quad j \in \{49\dots 54\} \cup \{57\dots 62\} \cup \{65\dots 70\} \cup \{78\dots 78\} \cup \{81\dots 126\}.$$

For the high rate codes, *R* is large, so that the product (E<sub>b</sub>/N<sub>0</sub>)*Rj* becomes very large and, again, the gaps between the upper and lower bounds are small.

Figure 2.6 compares bounded distance decoding and full decoding with maximum likelihood soft decision decoding of the (63, 39) and (63, 36) BCH codes. It can be seen from the figure that whilst the probability of error for maximum likelihood

**Fig. 2.5** Reed–Muller code lower and upper union bound performance, frame error rate (FER) against E<sub>b</sub>/N<sub>0</sub>

hard decision decoding is smaller than that of bounded distance decoding for all values of E<sub>b</sub>/N<sub>0</sub>, the upper bound on the probability of error for maximum likelihood soft decision decoding agrees with the lower bound above a certain value of E<sub>b</sub>/N<sub>0</sub>. This suggests that for soft decision decoding, the probability of error can be accurately determined by the lower union bound from a certain value of E<sub>b</sub>/N<sub>0</sub> onwards. Computing the lower union bound from (2.10) requires only knowledge of the minimum distance *d* of the code and the multiplicity *A<sub>d</sub>* of the minimum weight codewords. In practice, *A<sub>d</sub>* is much easier to obtain than the complete weight distribution of the code.
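The upper and lower union bounds (2.6) and (2.8) can be computed from a weight distribution in a few lines. A Python sketch (the function name is ours), illustrated with the (7, 4, 3) Hamming code, whose weight distribution *A*<sub>3</sub> = *A*<sub>4</sub> = 7, *A*<sub>7</sub> = 1 is well known:

```python
import math

def union_bounds(n, k, A, EbN0_dB):
    """Lower (2.8) and upper (2.6) union bounds on the sequence error
    probability for ML soft decision decoding of a binary code with BPSK
    on the AWGN channel. A maps codeword weight -> multiplicity
    (non-zero weights only)."""
    R = k / n
    snr = 10 ** (EbN0_dB / 10)          # E_b/N_0 as a linear ratio
    d = min(A)                           # minimum distance
    upper = 0.5 * sum(A_j * math.erfc(math.sqrt(snr * R * j))
                      for j, A_j in A.items())
    lower = 0.5 * A[d] * math.erfc(math.sqrt(snr * R * d))
    return lower, upper

# Weight distribution of the (7, 4, 3) Hamming code, zero weight excluded.
hamming_wd = {3: 7, 4: 7, 7: 1}
```

Evaluating the bounds at increasing SNR shows the behaviour described above: the ratio of the upper to the lower bound shrinks towards 1 as E<sub>b</sub>/N<sub>0</sub> grows.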

**Fig. 2.6** BCH code: Bounded distance, full and maximum likelihood soft decoding

#### *2.3.1 Performance Assuming a Binomial Weight Distribution*

Evaluating the performance of long codes with many codewords using the union upper bound is difficult, since one needs to compute the complete weight distribution of the codes; computing the weight distribution of a binary code is known to be NP-complete [3]. However, for many good linear binary codes, the weight distribution closely approximates to a binomial distribution. Let (E<sub>b</sub>/N<sub>0</sub>)<sub>δ</sub> be defined by,

**Fig. 2.7** Union bounds using binomial and actual weight distributions (WD) for best known codes

$$\frac{1}{2} A_d\, \text{erfc}\left(\sqrt{\frac{E_b}{N_0}Rd}\right)\Bigg|_{\frac{E_b}{N_0}=\left(\frac{E_b}{N_0}\right)_{\delta}} \approx \frac{1}{2}\sum_{j=d}^{n} A_j\, \text{erfc}\left(\sqrt{\frac{E_b}{N_0}Rj}\right)\Bigg|_{\frac{E_b}{N_0}=\left(\frac{E_b}{N_0}\right)_{\delta}}.\tag{2.12}$$

Hence, (E<sub>b</sub>/N<sub>0</sub>)<sub>δ</sub> is the SNR per bit at which the difference between the upper and lower union bounds for the code becomes very small. It is worth noting that equality in (2.12) is only possible as E<sub>b</sub>/N<sub>0</sub> approaches infinity, since lim<sub>x→∞</sub> erfc(x) = 0. To find (E<sub>b</sub>/N<sub>0</sub>)<sub>δ</sub> for a binary code (*n*, *k*, *d*), we simply assume a binomial weight distribution for the code, so that,

$$A\_i = \frac{2^k}{2^n} \binom{n}{i} \tag{2.13}$$

**Fig. 2.8** Union bounds using binomial and actual weight distributions (WD) for the (255, 120, 40) best known code

and compute an E<sub>b</sub>/N<sub>0</sub> value that satisfies (2.12). It must be noted that the (E<sub>b</sub>/N<sub>0</sub>)<sub>δ</sub> obtained using this approach is only an estimate. Its accuracy depends on how closely the weight distribution of the code approximates to a binomial, and on how small the difference *P*<sub>upper</sub> − *P*<sub>lower</sub> between the upper and lower union bounds is. Consider Fig. 2.7, which shows the upper and lower union bounds using binomial weight distributions and the actual weight distributions of the codes. From Fig. 2.7a, it can be seen that for the low rate (127, 30, 37) code the performance using the binomial approximation of the weight distribution does not agree with the performance using the actual weight distribution at low values of E<sub>b</sub>/N<sub>0</sub>. Interestingly, Fig. 2.7b–d show that as the rate of the codes increases, the actual weight distribution of the codes approaches a binomial. The difference in the performance of the codes using the binomial approximation and the actual weight distribution decreases as E<sub>b</sub>/N<sub>0</sub> increases. Figure 2.8 shows the performance of the (255, 120, 40) code using a binomial weight distribution. An estimate for (E<sub>b</sub>/N<sub>0</sub>)<sub>δ</sub> from the figure is 5.2 dB. Thus, for E<sub>b</sub>/N<sub>0</sub> ≥ 5.2 dB, we can estimate the performance of the (255, 120, 40) code under maximum likelihood soft decision decoding in the AWGN channel using the lower union bound.
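The estimate of (E<sub>b</sub>/N<sub>0</sub>)<sub>δ</sub> can be explored numerically by evaluating the two sides of (2.12) with the binomial weight distribution (2.13). A Python sketch (the function name is ours); the ratio of upper to lower bound shrinks towards 1 as E<sub>b</sub>/N<sub>0</sub> increases, which is how the 5.2 dB figure for the (255, 120, 40) code can be located:

```python
import math

def binomial_union_bounds(n, k, d, EbN0_dB):
    """Lower and upper union bounds using the binomial weight
    distribution A_j = 2^(k-n) * binom(n, j) of Eq. (2.13),
    applied for j >= d."""
    R = k / n
    snr = 10 ** (EbN0_dB / 10)          # E_b/N_0 as a linear ratio
    A = lambda j: 2.0 ** (k - n) * math.comb(n, j)
    lower = 0.5 * A(d) * math.erfc(math.sqrt(snr * R * d))
    upper = 0.5 * sum(A(j) * math.erfc(math.sqrt(snr * R * j))
                      for j in range(d, n + 1))
    return lower, upper
```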


**Fig. 2.9** Performance of self-dual codes

#### *2.3.2 Performance of Self-dual Codes*

A self-dual code *C* has the property that it is its own dual, such that,

$$C = C^{\perp}.$$

Self-dual codes are always half rate, with parameters (*n*, *n*/2, *d*). These codes are known to meet the Gilbert–Varshamov bound, and some of the best known codes are self-dual codes. Self-dual codes form a subclass of formally self-dual codes, which have the property that,

$$W(C) = W(C^{\perp}),$$

where *W*(*C*) denotes the weight distribution of *C*. The weight distribution of certain types of formally self-dual codes can be computed without enumerating all the codewords of the code. For this reason, these codes can readily be used for analytical purposes. The fact that self-dual codes all have the same code rate and good distance properties makes them ideal for evaluating the performance of codes of varying length. Consider Fig. 2.9, which shows the performance of binary self-dual (and formally self-dual) codes of different lengths using the upper and lower union bounds with actual weight distributions, bounded distance decoding and unencoded transmission. Figure 2.10

**Fig. 2.10** Coding gain against code length for self-dual codes at FER 10<sup>−10</sup> and 10<sup>−20</sup>

shows the coding gain of the self-dual codes at frame error rates (FER) of 10<sup>−10</sup> and 10<sup>−20</sup> for soft decision decoding (SDD) and bounded distance decoding (BDD). The coding gain represents the difference in dB between the SDD/BDD performance and unencoded transmission; it is a measure of the power saving obtainable from a coded system relative to an unencoded system at a certain probability of error. The SDD performance of the codes of length 168, 136 and 128 at FER 10<sup>−10</sup> is obtained from the union upper bound, because the upper and lower bounds do not agree at this FER; the coding gain for these cases is therefore a lower bound. It is instructive to note that the difference between the coding gains for SDD and BDD at the two values of FER increases with the length of the code. At a FER of 10<sup>−20</sup>, SDD gives 3.36 dB coding gain over BDD for the code of length 168 and 2.70 dB for the code of length 24. At a FER of 10<sup>−10</sup>, SDD gives 3.70 dB coding gain over BDD for the code of length 168 and 2.44 dB for the code of length 24.

#### **2.4 Summary**

In this chapter, we discussed the performance of codes under hard and soft decision decoding. For hard decision decoding, the performance of codes in the binary symmetric channel was discussed, and numerically evaluated results comparing the bounded distance decoder to the full decoder were presented for a range of codes whose coset leader weight distribution is known. It was shown that as the SNR per information bit increases, there is still an observable difference between bounded distance and full decoders. Lower and upper bounds for decoding in the BSC were also given for cases where the covering radius of the code is known. For soft decision decoding, the performance of a wide range of specific codes was evaluated numerically using the union bounds. The upper and lower union bounds were shown to converge for all codes as the SNR per information bit increases. It was apparent that for surprisingly low values of E<sub>b</sub>/N<sub>0</sub>, the performance of a linear code can be predicted using only knowledge of the multiplicity of codewords of minimum weight. It was also shown that for codes whose weight distribution is difficult to compute, a binomial weight distribution can be used instead.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 3 Soft Decision and Quantised Soft Decision Decoding**

#### **3.1 Introduction**

The use of hard decision decoding results in a decoding loss compared to soft decision decoding. Several references have quantified this loss, which is a function of the operating E<sub>b</sub>/N<sub>0</sub> ratio, the error-correcting code and the quantisation of the soft decisions. Wozencraft and Jacobs [6] give a detailed analysis of the effects of soft decision quantisation on the probability of decoding error, *P<sub>ec</sub>*, for the ensemble of all binary codes of length *n*, without restriction on the choice of code. Their analysis follows from the coding theorem presented by Gallager for the ensemble of random binary codes [3].

#### **3.2 Soft Decision Bounds**

There are 2<sup>*n*</sup> possible binary combinations for each codeword, which in terms of the *n*-dimensional signal space hypercube corresponds to one vertex taken from 2<sup>*n*</sup> possible vertices. There are 2<sup>*k*</sup> codewords and therefore 2<sup>*n*2<sup>*k*</sup></sup> different possible codes. The receiver is considered to be composed of 2<sup>*k*</sup> matched filters, one for each codeword, and a decoder error occurs if any of the matched filter receivers has a larger output than the matched filter receiver corresponding to the transmitted codeword. Consider this matched filter receiver and another different matched filter receiver, and consider that the two codewords differ in *d* bit positions; the Hamming distance between the two codewords is *d*. The energy per transmitted bit is *E<sub>s</sub>* = (*k*/*n*)*E<sub>b</sub>*, where *E<sub>b</sub>* is the energy per information bit. The noise variance per matched filtered received bit is σ<sup>2</sup> = *N*<sub>0</sub>/2, where *N*<sub>0</sub> is the single sided noise spectral density. In the absence of noise, the output of the matched filter receiver for the transmitted codeword is *n*√*E<sub>s</sub>*, and the output of the other codeword matched filter receiver is (*n* − 2*d*)√*E<sub>s</sub>*. The noise voltage at the output of the matched filter receiver for the transmitted codeword is denoted as *n<sub>c</sub>* − *n*<sub>1</sub>, and the noise voltage at the output of the other matched filter receiver will be *n<sub>c</sub>* + *n*<sub>1</sub>. The common noise voltage *n<sub>c</sub>* arises from correlation of the bits common to both codewords with the received noise, and the noise voltages −*n*<sub>1</sub> and *n*<sub>1</sub> arise, respectively, from correlation of the other *d* bits with the received noise.

A decoder error occurs if

$$(n - 2d)\sqrt{E\_s} + n\_c + n\_1 > n\sqrt{E\_s} + n\_c - n\_1,\tag{3.1}$$

that is, a decoder error occurs when 2*n*<sub>1</sub> > 2*d*√*E<sub>s</sub>*.

The average noise power associated with *n*<sub>1</sub> is *d*σ<sup>2</sup> = *dN*<sub>0</sub>/2, and as the noise is Gaussian distributed, the probability of decoder error, *p<sub>d</sub>*, is given by

$$p_d = \frac{1}{\sqrt{\pi d N_0}} \int_{d\sqrt{E_s}}^{\infty} e^{\frac{-x^2}{d N_0}} dx.\tag{3.2}$$

This may be expressed in terms of the complementary error function

$$\text{erfc}(y) = \frac{2}{\sqrt{\pi}} \int_{y}^{\infty} e^{-x^2} dx \tag{3.3}$$

and leads to

$$p\_d = \frac{1}{2} \text{erfc}\left(\sqrt{d\frac{k}{n}\frac{E\_b}{N\_0}}\right) \tag{3.4}$$

Each of the other 2<sup>*k*</sup> − 2 codewords may also cause a decoder error, but the weight distribution of the code *C<sub>i</sub>* is unknown. However, by averaging over all possible codes, knowledge of the weight distribution of a particular code is not required. The probability that two codewords of a code *C<sub>i</sub>* differ in *d* bit positions, *p*(*d*|*C<sub>i</sub>*), is given by the binomial distribution

$$p(d|C\_i) = \frac{\frac{n!}{(n-d)!d!}}{2^n} \tag{3.5}$$

A given linear code *C<sub>i</sub>* cannot have codewords of arbitrary weight, because the sum of a subset of codewords is also a codeword. However, for nonlinear codes, *p<sub>d</sub>* may be averaged over all of the codes without this constraint:

$$\overline{p\_C} = \sum\_{i=1}^{2^{n2^k}} p(d|C\_i)p(C\_i) < \frac{1}{2^{n2^k}} \sum\_{d=0}^n \sum\_{i=1}^{2^{n2^k}} \frac{\frac{n!}{(n-d)!d!}}{2^{n+1}} \text{erfc}\left(\sqrt{d \frac{k}{n} \frac{E\_b}{N\_0}}\right) \tag{3.6}$$


rearranging the order of summation

$$\overline{p\_C} < \frac{1}{2^{n2^k}} \sum\_{i=1}^{2^{n2^k}} \sum\_{d=0}^n \frac{\frac{n!}{(n-d)!d!}}{2^{n+1}} \text{erfc}\left(\sqrt{d \frac{k}{n} \frac{E\_b}{N\_0}}\right) \tag{3.7}$$

and

$$\overline{p\_C} < \frac{1}{2^{n+1}} \sum\_{d=0}^{n} \frac{n!}{(n-d)!d!} \text{erfc}\left(\sqrt{d \frac{k}{n} \frac{E\_b}{N\_0}}\right) \tag{3.8}$$

Remembering that any of the $2^k - 1$ matched filters may cause a decoder error, the overall probability of decoder error averaged over all possible binary codes, $\overline{p\_{\text{overall}}}$, is

$$
\overline{p\_{\text{overall}}} = 1 - \left(1 - \overline{p\_C}\right)^{2^k - 1} < 2^k \overline{p\_C} \tag{3.9}
$$

and

$$\overline{p\_{\text{overall}}} < \frac{2^k}{2^{n+1}} \sum\_{d=0}^n \frac{n!}{(n-d)!d!} \text{erfc}\left(\sqrt{d \frac{k}{n} \frac{E\_b}{N\_0}}\right) \tag{3.10}$$

An analytic solution may be obtained by observing that $\frac{1}{2}\text{erfc}(y)$ is upper bounded by $\mathbf{e}^{-y^2}$,

$$\overline{p\_{\text{overall}}} < \frac{2^k}{2^n} \sum\_{d=0}^n \frac{n!}{(n-d)!d!} \mathbf{e}^{-d\frac{k}{n}\frac{E\_b}{N\_0}} \tag{3.11}$$

and as observed by Wozencraft and Jacobs [6],

$$(1 + \mathbf{e}^{-\frac{k}{n}\frac{E\_b}{N\_0}})^n = \sum\_{d=0}^n \frac{n!}{(n-d)!d!} \mathbf{e}^{-d\frac{k}{n}\frac{E\_b}{N\_0}}\tag{3.12}$$

and

$$\overline{p\_C} < \frac{1}{2^n} \left(1 + \mathbf{e}^{-\frac{k}{n}\frac{E\_b}{N\_0}}\right)^n \tag{3.13}$$

$$\overline{p\_{\text{overall}}} < \frac{2^k}{2^n} \left(1 + \mathbf{e}^{-\frac{k}{n}\frac{E\_b}{N\_0}}\right)^n \tag{3.14}$$

Traditionally, a cut-off rate *R*<sup>0</sup> is defined after observing that

$$\frac{2^k}{2^n}\left(1+\mathbf{e}^{-\frac{k}{n}\frac{E\_b}{N\_0}}\right)^n = 2^k \left(\frac{1+\mathbf{e}^{-\frac{k}{n}\frac{E\_b}{N\_0}}}{2}\right)^n \tag{3.15}$$

with

$$2^{R\_0} = \left(\frac{2}{1 + \mathbf{e}^{-\frac{k}{n}\frac{E\_b}{N\_0}}}\right),\tag{3.16}$$

then

$$\overline{p\_{\text{overall}}} < 2^k 2^{-nR\_0} = 2^{k - nR\_0} = 2^{-n(R\_0 - \frac{k}{n})} \tag{3.17}$$

This result may be interpreted as follows: provided the number of information bits of the code is less than the length of the code times the cut-off rate, the probability of decoder error will approach zero as the length of the code approaches infinity. Equivalently, provided the rate of the code, $\frac{k}{n}$, is less than the cut-off rate, $R\_0$, the probability of decoder error will approach zero as the length of the code approaches infinity.
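The cut-off rate of Eq. (3.16) and the exact bound of Eq. (3.10) can both be evaluated directly. The following sketch (function names and parameterisation are choices made here) shows one way to do so:

```python
import math

def cutoff_rate(k, n, eb_n0_db):
    """Eq. (3.16): cut-off rate R0 for a rate k/n binary code."""
    eb_n0 = 10.0 ** (eb_n0_db / 10.0)
    return math.log2(2.0 / (1.0 + math.exp(-(k / n) * eb_n0)))

def random_coding_bound(k, n, eb_n0_db):
    """Eq. (3.10): exact union bound on decoder error probability,
    averaged over all possible binary (n, k) codes."""
    eb_n0 = 10.0 ** (eb_n0_db / 10.0)
    total = sum(math.comb(n, d) * math.erfc(math.sqrt(d * (k / n) * eb_n0))
                for d in range(n + 1))
    return 2.0 ** (k - n - 1) * total
```

For the rate $\frac{1}{2}$ ensemble, $R\_0$ crosses the code rate at roughly 2.5 dB, so by Eq. (3.17) long random codes decode reliably above that point.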

When $s$-level quantised soft decisions are used, with integer levels 0 to $2(s-1)$ for $s$ even and integer levels 0 to $s-1$ for $s$ odd, the transmitted binary signal has levels 0 and $2(s-1)$ for $s$ even, and levels 0 and $s-1$ for $s$ odd. The probability distribution of the quantised signal (bit) plus noise, after matched filtering, has probabilities $p\_i$, $i = 0$ to $s-1$, represented as

$$p(z) = \sum\_{i=0}^{s-1} p\_i z^{-2i}, \text{for s even} \tag{3.18}$$

and

$$p(z) = \sum\_{i=0}^{s-1} p\_i z^{-i}, \text{for } s \text{ odd} \tag{3.19}$$

A decoder error occurs if

$$s(n - 2d) + n\_c + n\_1 > sn + n\_c - n\_1 \tag{3.20}$$

and occurs when

$$n\_1 > sd \tag{3.21}$$


and has probability 0.5 when

$$n\_1 = sd\tag{3.22}$$

The probability of decoder error may be determined from a summation of terms from the overall probability distribution for the sum of *d* independent, quantised noise samples, and is given by a polynomial *qd* (*z*) at *z* = 0, where *qd* (*z*) is given by

$$q\_d(z) = p(z)^d \left( \frac{1 - z^{(s-1)d+1}}{1 - z} - 0.5z^{(s-1)d} \right), \text{ for } s \text{ even} \tag{3.23}$$

The $0.5z^{(s-1)d}$ term corresponds to $n\_1 = sd$, when the probability of decoder error is 0.5.

$$q\_d(z) = p(z)^d \left( \frac{1 - z^{\frac{s-1}{2}d + 1}}{1 - z} - 0.5 z^{\frac{s-1}{2}d} \right), \text{ for } s \text{ odd} \tag{3.24}$$

and the $0.5z^{\frac{s-1}{2}d}$ term corresponds to $n\_1 = sd$, when the probability of decoder error is 0.5.

The probability of decoder error is given by *qd* (*z*) when *z* = 0,

$$p\_d = q\_d(0) \tag{3.25}$$

The evaluation of the average probability of decoder error for quantised soft decisions, $\overline{p\_{C\_Q}}$, is given, as before, by averaging over all codes and rearranging the order of summation

$$\overline{p\_{C\_Q}} < \frac{1}{2^{n2^k}} \sum\_{i=1}^{2^{n2^k}} \sum\_{d=0}^{n} \frac{\frac{n!}{(n-d)!d!}}{2^n} q\_d(0) \tag{3.26}$$

Simplifying

$$\overline{p\_{C\_Q}} < \sum\_{d=0}^{n} \frac{\frac{n!}{(n-d)!d!}}{2^n} q\_d(0) \tag{3.27}$$

When hard decisions are used, the probability of each transmitted bit being received in error is given by

$$p\_b = \frac{1}{2}\text{erfc}\left(\sqrt{\frac{k}{n}\frac{E\_b}{N\_0}}\right) \tag{3.28}$$

Accordingly,

$$p(z) = 1 - p\_b + p\_b z^{-2} \tag{3.29}$$

and *qd* (*z*) for hard decisions becomes

$$q\_d(z) = (1 - p\_b + p\_b z^{-2})^d \left(\frac{1 - z^{d+1}}{1 - z} - 0.5z^d\right) \tag{3.30}$$

giving

$$\overline{p\_{C\_Q}} < \sum\_{d=0}^{n} \frac{\frac{n!}{(n-d)!d!}}{2^n} (1 - p\_b + p\_b z^{-2})^d \left( \frac{1 - z^{d+1}}{1 - z} - 0.5z^d \right) \quad \text{for } z = 0 \quad (3.31)$$

As before, any of the $2^k - 1$ matched filters may cause a decoder error; the overall probability of decoder error averaged over all possible binary codes, $\overline{p\_{overall\_Q}}$, is

$$
\overline{p\_{overall\_Q}} < 1 - (1 - \overline{p\_{C\_Q}})^{2^k - 1} < 2^k \overline{p\_{C\_Q}} \tag{3.32}
$$

and

$$\overline{p\_{overall}}\_{Q} < \frac{2^k}{2^n} \sum\_{d=0}^n \frac{n!}{(n-d)!d!} (1 - p\_b + p\_b z^{-2})^d \left( \frac{1 - z^{d+1}}{1 - z} - 0.5 z^d \right), \text{for } z = 0 \tag{3.33}$$
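For hard decisions, evaluating $q\_d(0)$ amounts to summing the binomial probabilities of more than $d/2$ errors among the $d$ differing bits, with an exact tie ($d/2$ errors, $d$ even) counted with probability 0.5. The sketch below evaluates Eq. (3.33) on this basis; the function names are choices made here:

```python
import math

def p_bit(k, n, eb_n0_db):
    """Eq. (3.28): hard-decision bit error probability."""
    eb_n0 = 10.0 ** (eb_n0_db / 10.0)
    return 0.5 * math.erfc(math.sqrt((k / n) * eb_n0))

def q_d0(d, pb):
    """q_d(0) for hard decisions: more than d/2 of the d differing bits
    in error, plus half the probability of an exact tie (d even)."""
    total = sum(math.comb(d, j) * pb**j * (1.0 - pb)**(d - j)
                for j in range(d // 2 + 1, d + 1))
    if d % 2 == 0:
        half = d // 2
        total += 0.5 * math.comb(d, half) * pb**half * (1.0 - pb)**(d - half)
    return total

def hard_decision_bound(k, n, eb_n0_db):
    """Eq. (3.33): random-coding union bound with hard decisions."""
    pb = p_bit(k, n, eb_n0_db)
    return 2.0 ** (k - n) * sum(math.comb(n, d) * q_d0(d, pb)
                                for d in range(n + 1))
```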

When three-level quantisation is used for the received signal plus noise, a threshold $v\_{thresh}$ is defined whereby, if the magnitude of the received signal plus noise is less than $v\_{thresh}$, an erasure is declared; otherwise a hard decision is made. The probability of an erasure, $p\_{\text{erase}}$, is given by

$$p\_{\text{erase}} = \frac{1}{\sqrt{\pi N\_0}} \int\_{\sqrt{\frac{k}{n}E\_b} - v\_{thresh}}^{\sqrt{\frac{k}{n}E\_b} + v\_{thresh}} \mathbf{e}^{\frac{-x^2}{N\_0}} dx \tag{3.34}$$

The probability of a bit error for the hard decision, *pb*, is now given by

$$p\_b = \frac{1}{\sqrt{\pi N\_0}} \int\_{\sqrt{\frac{k}{n}E\_b} + v\_{thresh}}^{\infty} \mathbf{e}^{\frac{-x^2}{N\_0}} dx \tag{3.35}$$

Accordingly, *p*(*z*) becomes

$$p(z) = 1 - p\_b - p\_{\text{erase}} + p\_{\text{erase}} z^{-1} + p\_b z^{-2} \tag{3.36}$$

and *qd* (*z*) for three-level soft decisions is

$$q\_d(z) = (1 - p\_b - p\_{\text{erase}} + p\_{\text{erase}}z^{-1} + p\_b z^{-2})^d \left(\frac{1 - z^{d+1}}{1 - z} - 0.5z^d\right) \tag{3.37}$$

**Fig. 3.1** Optimum threshold $\sqrt{E\_s} - y \times \sigma$, with $y \times \sigma$ plotted as a function of $\frac{E\_s}{N\_0} = \frac{k}{n}\frac{E\_b}{N\_0}$ and $d\_{min}$

giving

$$\overline{p\_{overall\_Q}} < \frac{2^k}{2^n} \sum\_{d=0}^n \frac{n!}{(n-d)!d!} (1 - p\_b - p\_{\text{erase}} + p\_{\text{erase}} z^{-1} + p\_b z^{-2})^d \left( \frac{1 - z^{d+1}}{1 - z} - 0.5 z^d \right), \text{ for } z = 0 \tag{3.38}$$

There is a best choice of $v\_{thresh}$ which minimises $\overline{p\_{overall\_Q}}$, and this is dependent on the code parameters $(n, k)$ and $\frac{E\_b}{N\_0}$. However, $v\_{thresh}$ is not an unduly sensitive parameter, and best values typically range from $0.6\sigma$ to $0.7\sigma$. The value of $0.65\sigma$ is mentioned in Wozencraft and Jacobs [6]. Optimum values of $v\_{thresh}$ are given in Fig. 3.1.
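Following the threshold definition above (an erasure is declared when the magnitude of the received value is below $v\_{thresh}$), the erasure and hard-decision error probabilities reduce to complementary error functions. A sketch, normalised so that $N\_0 = 1$; the function name and the $y\sigma$ parameterisation of the threshold are choices made here:

```python
import math

def three_level_probs(es_n0_db, y=0.65):
    """Erasure and hard-decision error probabilities for three-level
    quantisation, with the threshold y*sigma either side of zero
    (0.65*sigma is the value quoted from Wozencraft and Jacobs).
    Normalised to N0 = 1, hence sigma = sqrt(1/2)."""
    root_es = math.sqrt(10.0 ** (es_n0_db / 10.0))   # sqrt(Es) with N0 = 1
    v = y * math.sqrt(0.5)                           # v_thresh in volts
    p_b = 0.5 * math.erfc(root_es + v)               # received beyond -v_thresh
    p_erase = 0.5 * math.erfc(root_es - v) - p_b     # received inside (-v, +v)
    return p_erase, p_b
```

Both probabilities fall as $\frac{E\_s}{N\_0}$ rises, with erasures remaining far more likely than outright hard-decision errors.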

#### **3.3 Examples**

The overall probability of decoder error averaged over all possible binary codes has been evaluated for $\frac{k}{n} = \frac{1}{2}$ for soft decisions, using Eq. (3.10), the approximation given by Eq. (3.14), and for hard decisions, using Eq. (3.38), for various code lengths. Results are shown in Fig. 3.2 for the ensemble of (100, 50) binary codes. The difference between the exact random coding bound, Eq. (3.10), and the original, approximate, random coding bound, Eq. (3.14), is about 0.5 dB for (100, 50) codes. The loss due to hard decisions is around 2.1 dB (at $1 \times 10^{-5}$ it is 2.18 dB), and for three-level quantisation is around 1 dB (at $1 \times 10^{-5}$ it is 1.03 dB). Also shown in Fig. 3.2 is the sphere packing bound offset by the loss associated with binary transmission.

Results are shown in Fig. 3.3 for the ensemble of (200, 100) binary codes. The difference between the exact random coding bound, Eq. (3.10), and the original, approximate, random coding bound, Eq. (3.14), is about 0.25 dB for (200, 100) codes. The loss due to hard decisions is around 2.1 dB (at $1 \times 10^{-5}$ it is 2.15 dB), and for three-level quantisation is around 1 dB (at $1 \times 10^{-5}$ it is 0.999 dB). Also shown in Fig. 3.3 is the sphere packing bound offset by the loss associated with binary transmission. The exact random coding bound is now much closer to the sphere packing bound, offset by the loss associated with binary transmission, with a gap of about 0.2 dB at $10^{-8}$. It should be noted that the sphere packing bound is a lower bound whilst the random binary code bound is an upper bound.

**Fig. 3.2** Exact and approximate random coding bounds for [100, 50] binary codes and quantised decisions

**Fig. 3.3** Exact and approximate random coding bounds for [200, 100] binary codes and quantised decisions

Instead of considering random codes, the effect of soft decision quantisation is analysed for codes with a given weight spectrum. The analysis is restricted to two-level and three-level quantisation because these are the most common. In other cases, the quantisation is chosen such that near ideal soft decision decoding is realised. The analysis starts with a hypothetical code in which the Hamming distance between all codewords is the same, $d\_{min}$. The probability of decoder error due to a single matched filter having a greater output than the correct matched filter follows immediately from Eq. (3.4), and the code parameters may be eliminated by considering $\frac{E\_s}{N\_0}$ instead of $\frac{E\_b}{N\_0}$.

$$p\_d = \frac{1}{2} \text{erfc}\left(\sqrt{d\_{\min}\frac{E\_s}{N\_0}}\right) \tag{3.39}$$

For hard decisions and three-level quantisation, *pd* is given by

$$p\_d = (1 - p\_b - p\_{\text{erase}} + p\_{\text{erase}}z^{-1} + p\_b z^{-2})^{d\_{min}} \left( \frac{1 - z^{d\_{min} + 1}}{1 - z} - 0.5 z^{d\_{min}} \right), \text{ for } z = 0 \tag{3.40}$$

For hard decisions, $p\_{\text{erase}}$ is set equal to zero and $p\_b$ is given by Eq. (3.28). For three-level quantisation, $p\_{\text{erase}}$ is expressed in terms of $\frac{E\_{Qs}}{N\_0}$, the ratio required when quantised soft decision decoding is used.


$$p\_{\text{erase}} = \frac{1}{\sqrt{\pi N\_0}} \int\_{\sqrt{E\_{Qs}} - v\_{thresh}}^{\sqrt{E\_{Qs}} + v\_{thresh}} \mathbf{e}^{\frac{-x^2}{N\_0}} dx \tag{3.41}$$

Similarly, the probability of a bit error for the hard decision, *pb* is given by

$$p\_b = \frac{1}{\sqrt{\pi N\_0}} \int\_{\sqrt{E\_{Qs}} + v\_{thresh}}^{\infty} \mathbf{e}^{\frac{-x^2}{N\_0}} dx \tag{3.42}$$

By equating Eq. (3.39) with Eq. (3.40), the $\frac{E\_{Qs}}{N\_0}$ required for the same decoder error probability may be determined as a function of $\frac{E\_s}{N\_0}$ and $d\_{min}$. The loss, in dB, due to soft decision quantisation may be defined as

$$Loss\_{\mathcal{Q}} = 10 \times \log\_{10} \frac{E\_{\mathcal{Q}s}}{N\_0} - 10 \times \log\_{10} \frac{E\_s}{N\_0} \tag{3.43}$$
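Equating Eq. (3.39) with the hard-decision form of Eq. (3.40) has no closed-form solution, but the loss of Eq. (3.43) can be found by bisection. The sketch below evaluates $q\_{d\_{min}}(0)$ as a binomial tail with ties counted half; this is an illustrative reconstruction, broadly in line with Fig. 3.4 rather than the book's exact curves:

```python
import math

def q_d0(d, pb):
    """q_d(0) for hard decisions, ties counted with probability 0.5."""
    total = sum(math.comb(d, j) * pb**j * (1.0 - pb)**(d - j)
                for j in range(d // 2 + 1, d + 1))
    if d % 2 == 0:
        half = d // 2
        total += 0.5 * math.comb(d, half) * pb**half * (1.0 - pb)**(d - half)
    return total

def hard_decision_loss_db(dmin, es_n0_db):
    """Eq. (3.43): extra Es/N0 in dB needed so that hard-decision
    decoding matches the unquantised p_d of Eq. (3.39)."""
    target = 0.5 * math.erfc(math.sqrt(dmin * 10.0 ** (es_n0_db / 10.0)))
    lo, hi = es_n0_db, es_n0_db + 6.0   # bracket: the loss is well under 6 dB
    for _ in range(60):                 # bisect on the quantised Es/N0 (dB)
        mid = 0.5 * (lo + hi)
        pb = 0.5 * math.erfc(math.sqrt(10.0 ** (mid / 10.0)))
        if q_d0(dmin, pb) > target:     # still worse than unquantised
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - es_n0_db
```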

Figure 3.4 shows the soft decision quantisation loss, $Loss\_Q$, as a function of $d\_{min}$ and $\frac{E\_s}{N\_0}$ for hard decisions. For low $d\_{min}$, the loss is around 1.5 dB, but it rises rapidly with $d\_{min}$ to around 2 dB. For $\frac{E\_s}{N\_0} = 3$ dB, practical systems operate with $d\_{min}$ less than 15 or so because the decoder error rate is so very low (at $d\_{min} = 15$, the decoder error rate is less than $1 \times 10^{-20}$). Most practical systems will operate where the loss is around 2 dB. Low code rate systems (rate $\frac{1}{3}$ or less) operate with negative $\frac{E\_s}{N\_0}$ ratios in dB, with $d\_{min}$ in the range 25 to 40, whereas rate $\frac{1}{2}$ systems with $d\_{min}$ in the range 20 to 30 will typically operate at $\frac{E\_s}{N\_0}$ around 0 dB. Of course, not all decoder error events are $d\_{min}$ events, but the asymptotic nature of the loss produces an average loss of around 2 dB.

**Fig. 3.4** Loss due to hard decisions as a function of $\frac{E\_s}{N\_0}$ and $d\_{min}$

**Fig. 3.5** Loss due to three-level soft decisions (erasures) as a function of $\frac{E\_s}{N\_0}$ and $d\_{min}$

Figure 3.5 shows the soft decision quantisation loss, $Loss\_Q$, as a function of $d\_{min}$ and $\frac{E\_s}{N\_0}$ for three-level soft decisions. An optimum threshold has been determined for each value of $d\_{min}$ and $\frac{E\_s}{N\_0}$; these threshold values are of the form $\sqrt{E\_s} - y \times \sigma$, with $y \times \sigma$ plotted against $d\_{min}$ in Fig. 3.1. Unlike the hard decision case, for three-level quantisation the lowest loss occurs at high $d\_{min}$ values. In common with hard decisions, the lowest loss is for the smallest $\frac{E\_s}{N\_0}$ values, which are negative when expressed in dB. In absolute terms, the lowest loss is less than 1 dB for $\frac{E\_s}{N\_0} = -3$ dB and high $d\_{min}$. This corresponds to low-rate codes with code rates of $\frac{1}{3}$ or $\frac{1}{4}$. The loss for three-level quantisation is so much better than for hard decisions that it is somewhat surprising that three-level quantisation is not found more often in practical systems. The erasure channel is much underrated.

#### **3.4 A Hard Decision Dorsch Decoder and BCH Codes**

The effects of soft decision quantisation on the decoding performance of BCH codes may be explored using the extended Dorsch decoder (see Chap. 15) and by a bounded distance, hard decision decoder, first devised by Peterson [5] and refined by Chien [2], Berlekamp [1] and Massey [4]. The extended Dorsch decoder may be used directly on the received three-level quantised soft decisions and, of course, on the received unquantised soft decisions. It may also be used on the received hard decisions to form a near maximum likelihood decoder, a non-bounded distance, hard decision decoder, although this requires some modification.

The first stage of the extended Dorsch decoder is to rank the received signal samples in order of likelihood. For hard decisions, all signal samples have equal likelihood and no ranking is possible. However, a random ranking of *k*, independent bits may be substituted for the ranked *k* most reliable, independent bits. Provided the number of bit errors contained in these *k* bits is within the search space of the decoder, the most likely, or the correct codeword, will be found by the decoder. Given the received hard decisions contain *t* errors, and assuming the search space of the decoder can accommodate *m* errors, the probability of finding the correct codeword, or a more likely codeword, *p <sup>f</sup>* is given by

$$p\_f = \sum\_{i=0}^{m} \frac{n!}{(n-i)!\,i!} \left(\frac{t}{n}\right)^i \left(1 - \frac{t}{n}\right)^{n-i} \tag{3.44}$$

This probability may be improved by repeatedly carrying out a random ordering of the received samples and running the decoder. With $N$ such orderings, the probability $p\_{Nf}$ of finding the correct codeword, or a more likely codeword, is given by

$$p\_{Nf} = 1 - \left(1 - \sum\_{i=0}^{m} \frac{n!}{(n-i)!\,i!} \left(\frac{t}{n}\right)^i \left(1 - \frac{t}{n}\right)^{n-i}\right)^N \tag{3.45}$$

Increasing *N* gives

$$\left(1 - \sum\_{i=0}^{m} \frac{n!}{(n-i)!\,i!} \left(\frac{t}{n}\right)^i \left(1 - \frac{t}{n}\right)^{n-i}\right)^N \simeq 0\tag{3.46}$$

and

$$p\_{Nf} \cong 1\tag{3.47}$$

Of course there is a price to be paid because the complexity of the decoder increases with *N*. The parity check matrix needs to be solved *N* times. On the other hand, the size of the search space may be reduced because the repeated decoding allows several chances for the correct codeword to be found.
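Equations (3.44) and (3.45) are straightforward to evaluate, which makes it easy to trade off the number of orderings $N$ against the search depth $m$. A minimal sketch (names are illustrative):

```python
import math

def p_find(n, t, m):
    """Eq. (3.44): probability that one random ordering leaves at most m
    of the t channel errors inside the decoder's search space."""
    p = t / n
    return sum(math.comb(n, i) * p**i * (1.0 - p)**(n - i)
               for i in range(m + 1))

def p_find_after(n, t, m, N):
    """Eq. (3.45): probability of success over N independent orderings."""
    return 1.0 - (1.0 - p_find(n, t, m)) ** N
```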

The modified Dorsch decoder and a bounded distance hard decision BCH decoder have been applied to the [63, 36, 11] BCH code and the simulation results are shown in Fig. 3.6. The decoder search space was set to search 1 × 10<sup>6</sup> codewords for each received vector which ensures that quasi maximum likelihood decoding is obtained. Also shown in Fig. 3.6 is the sphere packing bound for a (63, 36) code offset by

**Fig. 3.6** Soft decision decoding of the (63, 36, 11) BCH code compared to hard decision decoding

the binary transmission loss. As can be seen, the unquantised soft decision decoder produces a performance close to the offset sphere packing bound. The three-level quantisation decoder results are offset approximately 0.9 dB at 1 × 10−<sup>5</sup> from the unquantised soft decision performance. For hard decisions, the modified Dorsch decoder has a performance approximately 2 dB at 1 × 10−<sup>3</sup> from the unquantised soft decision performance and approximately 2.2 dB at 1 × 10−5. Interestingly, this hard decision performance is approximately 0.4 dB better than the bounded distance BCH decoder correcting up to and including 5 errors.

The results for the BCH (127, 92, 11) code are shown in Fig. 3.7. These results are similar to those of the (63, 36, 11) BCH code. At 1 × 10−<sup>5</sup> Frame Error Rate (FER), the unquantised soft decision decoder produces a performance nearly 0.2 dB from the offset sphere packing bound. The three-level quantisation decoder results are offset approximately 1.1 dB at 1 × 10−<sup>5</sup> from the unquantised soft decision performance. This is a higher rate code than the (63, 36, 11) code, and at 1 × 10−<sup>5</sup> the *Es <sup>N</sup>*<sup>0</sup> ratio is 4.1 dB. Figure 3.5 for a *dmin* of 11 and an *Es <sup>N</sup>*<sup>0</sup> ratio of 3 dB indicates a loss of 1.1 dB, giving good agreement to the simulation results. For hard decisions, the modified Dorsch decoder has a performance approximately 2 dB at 1×10−<sup>3</sup> from the unquantised soft decision performance, and approximately 2.1 dB at 1 × 10−5. This is consistent with the theoretical hard decision losses shown in Fig. 3.4. As before, the hard decision performance obtained with the modified Dorsch decoder is better than the bounded distance BCH decoder correcting up to and including five errors, and shows almost 0.5 dB improvement.

**Fig. 3.7** Soft decision decoding of the (127, 92, 11) BCH code compared to hard decision decoding

**Fig. 3.8** Soft decision decoding of the (127, 64, 21) BCH code compared to hard decision decoding

The results for the BCH (127, 64, 21) code are shown in Fig. 3.8. This is an outstanding code, and consequently the unquantised soft decision decoding performance is very close to the offset sphere packing bound, being almost 0.1 dB away from the bound at $1 \times 10^{-5}$. However, a list size of $10^7$ codewords was used in order to ensure that near maximum likelihood performance was obtained by the modified Dorsch decoder. Similarly to before, the three-level quantisation decoder results are offset approximately 1.1 dB at $1 \times 10^{-5}$ from the unquantised soft decision performance. However, $3 \times 10^7$ codewords were necessary in order for near maximum likelihood performance to be obtained by the modified Dorsch decoder operating on the three-level quantised decisions. The BCH bounded distance decoder is approximately 3 dB offset from the unquantised soft decision decoding performance and 1 dB from the modified Dorsch decoder operating on the hard decisions.

These simulation results for the losses due to quantisation of the soft decisions show a very close agreement to the losses anticipated from the theoretical analysis.

#### **3.5 Summary**

In this chapter, we derived both approximate and exact bounds on the performance of soft decision decoding compared to hard decision decoding as a function of code parameters. The effects of soft decision quantisation were explored, showing the decoding performance loss as a function of the number of quantisation levels. Results were presented for the ensembles of all (100, 50) and (200, 100) codes. It was shown that the loss due to quantisation is a function of both $d\_{min}$ and SNR. Performance graphs showing the relationship were presented.

It was shown that the near maximum likelihood decoder, the Dorsch decoder described in Chap. 15, may be adapted for hard decision decoding in order to produce better performance than bounded distance decoding. Performance graphs were presented for some BCH codes showing the performance achieved compared to bounded distance decoding.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Part II Code Construction**

This part of the book deals with the construction of error-correcting codes having good code properties. With an emphasis on binary codes, a wide range of different code constructions are described including cyclic codes, double circulant codes, quadratic residue codes, Goppa codes, Lagrange codes, BCH codes and Reed–Solomon codes. Code combining constructions such as Construction X are also included. For shorter codes, typically less than 512 symbols long, the emphasis is on the highest minimum Hamming distance for a given length and code rate. The construction of some outstanding codes is described in detail together with the derivation of the weight distributions of the codes. For longer codes, the emphasis is on the best code design for a given type of decoder, such as the iterative decoder. Binary convolutional codes are discussed from the point of view of their historical performance in comparison to the performance realised with modern best decoding techniques. Convolutional codes, designed for space communications in the 1960s, are implemented as tail-biting block codes. The performance realised with near maximum likelihood decoding, featuring the modified Dorsch decoder described in Chap. 15, is somewhat surprising.

# **Chapter 4 Cyclotomic Cosets, the Mattson–Solomon Polynomial, Idempotents and Cyclic Codes**

#### **4.1 Introduction**

Much of the pioneering research on cyclic codes was carried out by Prange [5] in the 1950s and considerably developed by Peterson [4] in terms of generator and parity-check polynomials. MacWilliams and Sloane [2] showed that cyclic codes could be generated from idempotents and the Mattson–Solomon polynomial, first introduced by Mattson and Solomon in 1961 [3]. The binary idempotent polynomials follow directly from cyclotomic cosets.

#### **4.2 Cyclotomic Cosets**

Consider the expansion of the polynomial $a(x) = \prod\_{i=0}^{m-1}(x - \alpha^{2^i})$. The coefficients of $a(x)$ are a cyclotomic coset of powers of $\alpha$ or a sum of cyclotomic cosets of powers of $\alpha$. For example, if $m = 4$,

$$a(\mathbf{x}) = (\mathbf{x} - \alpha)(\mathbf{x} - \alpha^2)(\mathbf{x} - \alpha^4)(\mathbf{x} - \alpha^8) \tag{4.1}$$

and expanding *a*(*x*) produces

$$\begin{split} a(x) &= x^4 - (\alpha + \alpha^2 + \alpha^4 + \alpha^8)x^3 + (\alpha^3 + \alpha^6 + \alpha^{12} + \alpha^9 + \alpha^5 + \alpha^{10})x^2 \\ &\quad + (\alpha^7 + \alpha^{14} + \alpha^{13} + \alpha^{11})x + \alpha^{15}. \end{split} \tag{4.2}$$

**Definition 4.1** (*Cyclotomic Coset*) Let $s$ be a positive integer; the 2-cyclotomic coset of $s \pmod{n}$ is given by

© The Author(s) 2017 M. Tomlinson et al., *Error-Correction Coding and Decoding*, Signals and Communication Technology, DOI 10.1007/978-3-319-51103-0\_4

61

$$C\_s = \{ 2^i s \pmod{n} \mid 0 \le i \le t \},$$

where $s$ is the smallest element in the set $C\_s$ and $t$ is the smallest positive integer such that $2^{t+1}s \equiv s \pmod{n}$.

For convenience, we will use the term cyclotomic coset to refer to the 2-cyclotomic coset. If $N$ is the set consisting of the smallest elements of all possible cyclotomic cosets, then it follows that

$$\bigcup\_{s \in N} C\_s = \{0, 1, 2, \dots, n - 1\}.$$

*Example 4.1* The complete set of cyclotomic cosets modulo 15 is as follows:

$$\begin{aligned} C\_0 &= \{0\} \\ C\_1 &= \{1, 2, 4, 8\} \\ C\_3 &= \{3, 6, 12, 9\} \\ C\_5 &= \{5, 10\} \\ C\_7 &= \{7, 14, 13, 11\} \end{aligned}$$

and *N* = {0, 1, 3, 5, 7}.
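The cosets of Example 4.1 can be generated mechanically by repeated doubling modulo $n$. A minimal sketch (the function name is a choice made here):

```python
def cyclotomic_cosets(n):
    """Partition {0, ..., n-1} into 2-cyclotomic cosets modulo n,
    keyed by the smallest element (the coset leader) of each coset."""
    cosets, seen = {}, set()
    for s in range(n):
        if s in seen:
            continue
        coset, x = [], s
        while x not in seen:      # follow s -> 2s -> 4s -> ... until it cycles
            coset.append(x)
            seen.add(x)
            x = (2 * x) % n
        cosets[s] = coset
    return cosets
```

Calling `cyclotomic_cosets(15)` reproduces the cosets listed above, with leaders $N = \{0, 1, 3, 5, 7\}$.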

It can be seen that for *GF*(2<sup>4</sup>) above, Eq. (4.2), the coefficients of *a*(*x*) are a cyclotomic coset of powers of α or a sum of cyclotomic cosets of powers of α. For example, the coefficient of *x*<sup>3</sup> is the sum of powers of α from cyclotomic coset *C*1.

In the next step of the argument we note that there is an important property of Galois fields.

**Theorem 4.1** *For a Galois field $GF(p^m)$, then*

$$\left(b(\mathfrak{x}) + c(\mathfrak{x})\right)^p = b(\mathfrak{x})^p + c(\mathfrak{x})^p.$$

*Proof* Expanding $\left(b(x) + c(x)\right)^p$ produces

$$\left(b(\mathbf{x}) + c(\mathbf{x})\right)^p = b(\mathbf{x})^p + \binom{p}{1} b(\mathbf{x})^{p-1} c(\mathbf{x}) + \binom{p}{2} b(\mathbf{x})^{p-2} c(\mathbf{x})^2 + \dotsb \tag{4.3}$$

$$\dots + \binom{p}{p-1} b(\mathbf{x}) c(\mathbf{x})^{p-1} + c(\mathbf{x})^p.$$

As the binomial coefficients $\binom{p}{r}$ are divisible by $p$ for $0 < r < p$, all of these binomial coefficient terms are 0 modulo $p$, and

$$\left(b(\alpha) + c(\alpha)\right)^p = b(\alpha)^p + c(\alpha)^p.$$

Another theorem follows.

**Theorem 4.2** *The sum of powers of* α *that are from a cyclotomic coset Ci is equal to either 1 or 0.*

*Proof* The sum of powers of $\alpha$ that are from a cyclotomic coset $C\_i$ must equal a field element, some power $j$ of $\alpha$, i.e. $\alpha^j$, or 0. Also, from Theorem 4.1,

$$\left(\sum \alpha^{C\_i}\right)^2 = \sum \alpha^{C\_i}.$$

If the sum of powers of α is non-zero then

$$\left(\sum \alpha^{C\_i}\right)^2 = \alpha^{2j} = \sum \alpha^{C\_i} = \alpha^j.$$

The only non-zero field element that satisfies α<sup>2</sup>*<sup>j</sup>* = α*<sup>j</sup>* is α<sup>0</sup> = 1. Hence, the sum of powers of α that are from a cyclotomic coset *Ci* is equal to either 1 or 0.

In the example of *C*<sup>1</sup> from *GF*(2<sup>4</sup>) we have

$$(\alpha + \alpha^2 + \alpha^4 + \alpha^8)^2 = \alpha^2 + \alpha^4 + \alpha^8 + \alpha^{16} = \alpha^2 + \alpha^4 + \alpha^8 + \alpha$$

and so

$$
\alpha + \alpha^2 + \alpha^4 + \alpha^8 = 0 \text{ or } 1.
$$

Returning to the expansion of the polynomial $a(x) = \prod\_{i=0}^{m-1}(x - \alpha^{2^i})$: since the coefficients of $a(x)$ are a cyclotomic coset of powers of $\alpha$ or a sum of cyclotomic cosets of powers of $\alpha$, the coefficients of $a(x)$ must be 0 or 1, and $a(x)$ must have binary coefficients, after noting that the coefficient of $x^0$ is $\prod\_{i=0}^{m-1}\alpha^{2^i} = \alpha^{2^m - 1} = 1$, as $2^m - 1$ is the order of $\alpha$. Considering the previous example of $m = 4$ ($GF(2^4)$), since $a(x)$ is constrained to have binary coefficients, we have the following possible identities:

$$\begin{aligned} \alpha^{15} &= 1 \\ \alpha + \alpha^2 + \alpha^4 + \alpha^8 &= 0 \text{ or } 1 \\ \alpha^7 + \alpha^{14} + \alpha^{13} + \alpha^{11} &= 0 \text{ or } 1 \\ \alpha^3 + \alpha^6 + \alpha^{12} + \alpha^9 + \alpha^5 + \alpha^{10} &= 0 \text{ or } 1. \end{aligned} \tag{4.4}$$

These identities are determined by the choice of primitive polynomial used to generate the extension field. This can be seen from the Trace function, *Tm*(*x*), defined as


$$T\_m(\mathbf{x}) = \sum\_{i=0}^{m-1} \mathbf{x}^{2^i} \tag{4.5}$$

and expanding the product $T\_m(x)\left(1 + T\_m(x)\right)$ produces the identity

$$T\_m(x)\left(1+T\_m(x)\right) = x(1-x^n).\tag{4.6}$$

$\alpha$ is a root of $(1 - x^n)$ and so $\alpha$ is a root of either $T\_m(x)$ or $\left(1 + T\_m(x)\right)$; hence either $T\_m(\alpha) = 0$ or $1 + T\_m(\alpha) = 0$. For $GF(2^4)$,

$$T\_m(\mathbf{x}) = \sum\_{i=0}^{3} \mathbf{x}^{2^i} = \mathbf{x} + \mathbf{x}^2 + \mathbf{x}^4 + \mathbf{x}^8. \tag{4.7}$$

Factorising produces

$$\mathbf{x} + \mathbf{x}^2 + \mathbf{x}^4 + \mathbf{x}^8 = \mathbf{x}(1+\mathbf{x})(1+\mathbf{x}+\mathbf{x}^2)(1+\mathbf{x}+\mathbf{x}^4),\tag{4.8}$$

and

$$1 + T\_m(\mathbf{x}) = 1 + \sum\_{i=0}^{3} \mathbf{x}^{2^i} = 1 + \mathbf{x} + \mathbf{x}^2 + \mathbf{x}^4 + \mathbf{x}^8. \tag{4.9}$$

Factorising produces

$$1+x+x^2+x^4+x^8 = (1+x^3+x^4)(1+x+x^2+x^3+x^4).\tag{4.10}$$

It may be verified that

$$\begin{aligned} T\_m(\mathbf{x}) (1 + T\_m(\mathbf{x})) &= (\mathbf{x} + \mathbf{x}^2 + \mathbf{x}^4 + \mathbf{x}^8)(1 + \mathbf{x} + \mathbf{x}^2 + \mathbf{x}^4 + \mathbf{x}^8) \\ &= \mathbf{x}(1 + \mathbf{x})(1 + \mathbf{x} + \mathbf{x}^2)(1 + \mathbf{x} + \mathbf{x}^4)(1 + \mathbf{x}^3 + \mathbf{x}^4) \\ &\qquad (1 + \mathbf{x} + \mathbf{x}^2 + \mathbf{x}^3 + \mathbf{x}^4) \\ &= \mathbf{x}(1 - \mathbf{x}^{15}). \end{aligned}$$

Consequently, if 1 + *x* + *x*<sup>4</sup> is used to generate the extension field *GF*(16) then α + α<sup>2</sup> + α<sup>4</sup> + α<sup>8</sup> = 0 and if 1 + *x*<sup>3</sup> + *x*<sup>4</sup> is used to generate the extension field *GF*(16), then 1 + α + α<sup>2</sup> + α<sup>4</sup> + α<sup>8</sup> = 0.

Taking the case that $1 + x + x^4$ is used to generate the extension field $GF(16)$, by comparing the coefficients given by Eq. (4.2) we can solve the identities of (4.4), after noting that $\alpha^5 + \alpha^{10}$ must equal 1, otherwise the order of $\alpha$ would be 5, contradicting $\alpha$ being a primitive root. All of the identities of the sum for each cyclotomic coset of powers of $\alpha$ are denoted by $S\_{i,m}$ and these are

$$\begin{aligned} S\_{0,4} &= \alpha^0 = 1\\ S\_{1,4} &= \alpha + \alpha^2 + \alpha^4 + \alpha^8 = 0\\ S\_{3,4} &= \alpha^3 + \alpha^6 + \alpha^{12} + \alpha^9 = 1\\ S\_{5,4} &= \alpha^5 + \alpha^{10} = 1\\ S\_{7,4} &= \alpha^7 + \alpha^{14} + \alpha^{13} + \alpha^{11} = 1\\ S\_{15,4} &= \alpha^{15} = 1. \end{aligned}\tag{4.11}$$
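These identities can be checked numerically by carrying out the $GF(16)$ arithmetic directly, with field elements held as 4-bit integers and the primitive polynomial $1 + x + x^4$ represented as the bit pattern `0b10011`. A sketch; the helper names are not from the text:

```python
def gf16_mul(a, b, poly=0b10011):
    """Multiply in GF(2^4) generated by 1 + x + x^4 (bits 0b10011)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0b10000:        # reduce modulo the primitive polynomial
            a ^= poly
        b >>= 1
    return r

def gf16_pow(a, e):
    r = 1
    for _ in range(e):
        r = gf16_mul(r, a)
    return r

def coset_sum(coset):
    """XOR-sum of alpha^i over a cyclotomic coset, with alpha = x = 0b10."""
    total = 0
    for i in coset:
        total ^= gf16_pow(0b10, i)
    return total
```

With this choice of primitive polynomial, `coset_sum([1, 2, 4, 8])` gives 0 and the remaining coset sums give 1, matching the identities above.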

The lowest degree polynomial that has $\beta$ as a root is traditionally known as a minimal polynomial [2], and is denoted as $M\_{i,m}$, where $\beta = \alpha^i$. With $M\_{i,m}$ having binary coefficients,

$$M_{i,m} = \prod_{j=0}^{m-1} (x - \alpha^{i2^j}).\tag{4.12}$$

For *GF*(2<sup>4</sup>) and considering *M*<sub>3,4</sub> for example,

$$M_{3,4} = (x - \alpha^3)(x - \alpha^6)(x - \alpha^{12})(x - \alpha^9), \tag{4.13}$$

and expanding leads to

$$\begin{split} M\_{3,4} &= \mathbf{x}^4 - (\boldsymbol{\alpha}^3 + \boldsymbol{\alpha}^6 + \boldsymbol{\alpha}^{12} + \boldsymbol{\alpha}^9)\mathbf{x}^3 + (\boldsymbol{\alpha}^9 + \boldsymbol{\alpha}^3 + \boldsymbol{\alpha}^6 + \boldsymbol{\alpha}^{12})\mathbf{x}^2 \\ &+ (\boldsymbol{\alpha}^6 + \boldsymbol{\alpha}^{12} + \boldsymbol{\alpha}^9 + \boldsymbol{\alpha}^3)\mathbf{x} + 1. \end{split} \tag{4.14}$$

It will be noticed that this is the same as Eq. (4.2) with α replaced by α<sup>3</sup>. Using the identities of Eq. (4.11), it is found that

$$M_{3,4} = x^4 + x^3 + x^2 + x + 1.\tag{4.15}$$

Similarly, it is found that for *M*<sub>5,4</sub> substitution produces *x*<sup>4</sup> + *x*<sup>2</sup> + 1, which is (*x*<sup>2</sup> + *x* + 1)<sup>2</sup>, and so

$$M_{5,4} = x^2 + x + 1;\tag{4.16}$$

similarly, it is found that

$$M_{7,4} = x^4 + x^3 + 1\tag{4.17}$$

for *M*<sub>0,4</sub> with β = α<sup>15</sup> = 1, substitution produces *x*<sup>4</sup> + 1 = (1 + *x*)<sup>4</sup> and

$$M_{0,4} = x + 1.\tag{4.18}$$

It will be noticed that all of the minimal polynomials correspond to the factors of 1 + *x*<sup>15</sup> given above. Also, it was not necessary to generate a table of *GF*(2<sup>4</sup>) field elements in order to determine all of the minimal polynomials once *M*<sub>1,4</sub> was chosen.
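The expansion of Eq. (4.12) over a conjugate coset can be sketched in a few lines. This is a minimal illustration using the *GF*(16) representation modulo 1 + *x* + *x*<sup>4</sup>; `minimal_poly` and `gmul` are hypothetical helper names:

```python
MOD = 0b10011  # 1 + x + x^4, defining GF(16)

def gmul(a, b):
    """Multiply two GF(16) elements held as bitmask polynomials."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    while r and r.bit_length() > 4:
        r ^= MOD << (r.bit_length() - 5)
    return r

def alpha_pow(e):
    r = 1
    for _ in range(e % 15):
        r = gmul(r, 0b10)
    return r

def minimal_poly(i, n=15):
    """Expand prod_j (x + alpha^(i*2^j)) over the cyclotomic coset of i.

    Returns the coefficient list, lowest degree first; for binary
    minimal polynomials every entry is 0 or 1."""
    coset, e = [], i % n
    while e not in coset:
        coset.append(e)
        e = (2 * e) % n
    poly = [1]  # the constant polynomial 1
    for e in coset:
        root = alpha_pow(e)
        new = [0] * (len(poly) + 1)
        for k, c in enumerate(poly):
            new[k + 1] ^= c              # multiply term by x
            new[k] ^= gmul(c, root)      # multiply term by the root
        poly = new
    return poly
```

For example, `minimal_poly(3)` returns the coefficients of *x*<sup>4</sup> + *x*<sup>3</sup> + *x*<sup>2</sup> + *x* + 1, agreeing with Eq. (4.15).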

A recurrence relation exists for the cyclotomic cosets with increasing *m* for

$$M_{i,m+1} = \left(\prod_{j=0}^{m-1} (x - \alpha^{i2^j})\right)\left(x - \alpha^{i2^m}\right).\tag{4.19}$$

For *m* = 4,

$$M_{1,4} = x^4 + S_{1,4}x^3 + (S_{3,4} + S_{5,4})x^2 + S_{7,4}x + \alpha^{15} \tag{4.20}$$

and so

$$M_{1,5} = \left(x^4 + S_{1,4}x^3 + (S_{3,4} + S_{5,4})x^2 + S_{7,4}x + \alpha^{15}\right)(x + \alpha^{16})\tag{4.21}$$

and

$$\begin{aligned} M_{1,5} &= x^5 + (\alpha^{16} + S_{1,4})x^4 + \left(\alpha^{16}S_{1,4} + S_{3,4} + S_{5,4}\right)x^3 \\ &\quad + \left(\alpha^{16}(S_{3,4} + S_{5,4}) + S_{7,4}\right)x^2 + (\alpha^{16}S_{7,4} + \alpha^{15})x + \alpha^{31} \end{aligned} \tag{4.22}$$

and we find that

$$\begin{aligned} M_{1,5} &= x^5 + S_{1,5}x^4 + (S_{3,5} + S_{5,5})x^3 \\ &\quad + (S_{7,5} + S_{11,5})x^2 + S_{15,5}x + \alpha^{31}. \end{aligned} \tag{4.23}$$

We have the following identities, linking the cyclotomic cosets of *GF*(2<sup>4</sup>) to *GF*(2<sup>5</sup>)

$$\begin{aligned} S_{3,5} + S_{5,5} &= \alpha^{16}S_{1,4} + S_{3,4} + S_{5,4} \\ S_{7,5} + S_{11,5} &= \alpha^{16}(S_{3,4} + S_{5,4}) + S_{7,4} \\ S_{15,5} &= \alpha^{16}S_{7,4} + \alpha^{15}. \end{aligned}$$

With 1 + *x*<sup>2</sup> + *x*<sup>5</sup> used to generate the extension field *GF*(32), then α + α<sup>2</sup> + α<sup>4</sup> + α<sup>8</sup> + α<sup>16</sup> = 0. Evaluating the cyclotomic cosets of powers of α produces

$$\begin{aligned} S_{0,5} &= \alpha^0 = 1 \\ S_{1,5} &= \alpha + \alpha^2 + \alpha^4 + \alpha^8 + \alpha^{16} = 0 \\ S_{3,5} &= \alpha^3 + \alpha^6 + \alpha^{12} + \alpha^{24} + \alpha^{17} = 1 \\ S_{5,5} &= \alpha^5 + \alpha^{10} + \alpha^{20} + \alpha^9 + \alpha^{18} = 1 \\ S_{7,5} &= \alpha^7 + \alpha^{14} + \alpha^{28} + \alpha^{25} + \alpha^{19} = 0 \\ S_{11,5} &= \alpha^{11} + \alpha^{22} + \alpha^{13} + \alpha^{26} + \alpha^{21} = 1\\ S_{15,5} &= \alpha^{15} + \alpha^{30} + \alpha^{29} + \alpha^{27} + \alpha^{23} = 0. \end{aligned} \tag{4.24}$$

Substituting for the minimal polynomials, *M*<sub>*i*,5</sub>, produces

$$\begin{aligned} M_{0,5} &= x + 1\\ M_{1,5} &= x^5 + x^2 + 1\\ M_{3,5} &= x^5 + x^4 + x^3 + x^2 + 1\\ M_{5,5} &= x^5 + x^4 + x^2 + x + 1\\ M_{7,5} &= x^5 + x^3 + x^2 + x + 1\\ M_{11,5} &= x^5 + x^4 + x^3 + x + 1\\ M_{15,5} &= x^5 + x^3 + 1. \end{aligned} \tag{4.25}$$

For *GF*(2<sup>5</sup>), the order of a root of a primitive polynomial is 31, a prime number. Moreover, 31 is a Mersenne prime (a prime of the form 2<sup>*p*</sup> − 1), and the first 12 Mersenne primes correspond to *p* = 2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107 and 127. Interestingly, only 49 Mersenne primes are known, the largest, 2<sup>74207281</sup> − 1, having been discovered in January 2016. As 2<sup>5</sup> − 1 is prime, each of the degree-5 minimal polynomials in Eq. (4.25) is primitive.
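Primitivity is easy to confirm computationally: each degree-5 polynomial in Eq. (4.25) divides *x*<sup>31</sup> − 1, and the multiplicative order of its root *x* is exactly 31. A minimal sketch with bitmask polynomial arithmetic (helper names are illustrative):

```python
def pmod(a, m):
    """Reduce polynomial a modulo m over GF(2) (polynomials as bitmasks)."""
    while a and a.bit_length() >= m.bit_length():
        a ^= m << (a.bit_length() - m.bit_length())
    return a

# The degree-5 minimal polynomials of Eq. (4.25), as bitmasks.
DEGREE5 = [
    0b100101,  # x^5 + x^2 + 1
    0b111101,  # x^5 + x^4 + x^3 + x^2 + 1
    0b110111,  # x^5 + x^4 + x^2 + x + 1
    0b101111,  # x^5 + x^3 + x^2 + x + 1
    0b111011,  # x^5 + x^4 + x^3 + x + 1
    0b101001,  # x^5 + x^3 + 1
]

def order_of_x(m):
    """Multiplicative order of x modulo m, i.e. the order of a root of m."""
    p, k = pmod(0b10, m), 1
    while p != 1:
        p = pmod(p << 1, m)  # multiply by x and reduce
        k += 1
    return k

orders = [order_of_x(m) for m in DEGREE5]
divides = [pmod((1 << 31) ^ 1, m) == 0 for m in DEGREE5]
```

Every order comes out as 31 and every division leaves a zero remainder, confirming that all six polynomials are primitive factors of *x*<sup>31</sup> − 1.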

If α is a primitive root of *GF*(2<sup>2*m*</sup>), then *T*<sub>2*m*</sub>(*x*) = *T*<sub>*m*</sub>(*x*) + (*T*<sub>*m*</sub>(*x*))<sup>2<sup>*m*</sup></sup> and α<sup>(2<sup>2*m*</sup>−1)/(2<sup>*m*</sup>−1)</sup> is a root of *T*<sub>2*m*</sub>(*x*). For example, with α a primitive root of *GF*(16), α<sup>5</sup> is a root of 1 + *x* + *x*<sup>2</sup>, is of order 3, and is also a root of *x* + *x*<sup>2</sup> + *x*<sup>4</sup> + *x*<sup>8</sup>. Correspondingly, 1 + *x* + *x*<sup>2</sup> is a factor of 1 + *x*<sup>3</sup> and also a factor of 1 + *x*<sup>15</sup>, and necessarily 2<sup>2*m*</sup> − 1 cannot be prime. Similarly, if *m* is not a prime and *m* = *ab*, then

$$\frac{2^m - 1}{2^b - 1} = 2^{b(a-1)} + 2^{b(a-2)} + 2^{b(a-3)} + \dots + 1 \tag{4.26}$$

and so

$$2^m - 1 = (2^{b(a-1)} + 2^{b(a-2)} + 2^{b(a-3)} + \dots + 1)(2^b - 1). \tag{4.27}$$

Similarly

$$2^m - 1 = (2^{a(b-1)} + 2^{a(b-2)} + 2^{a(b-3)} + \dots + 1)(2^a - 1). \tag{4.28}$$

As a consequence

$$M_{(2^{b(a-1)} + 2^{b(a-2)} + 2^{b(a-3)} + \dots + 1) \times j,\, m} = M_{j,b} \tag{4.29}$$

for all minimal polynomials of *x*<sup>2<sup>*b*</sup>−1</sup> − 1, and

$$M_{(2^{a(b-1)} + 2^{a(b-2)} + 2^{a(b-3)} + \dots + 1) \times j,\, m} = M_{j,a} \tag{4.30}$$

for all minimal polynomials of *x*<sup>2<sup>*a*</sup>−1</sup> − 1.
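The underlying factorisation of 2<sup>*m*</sup> − 1 is easy to spot-check numerically: (2<sup>*ab*</sup> − 1)/(2<sup>*b*</sup> − 1) is the geometric sum 2<sup>*b*(*a*−1)</sup> + ⋯ + 2<sup>*b*</sup> + 1. A small sketch (the helper name `repunit` is illustrative):

```python
def repunit(a, b):
    """(2^(ab) - 1) / (2^b - 1), i.e. the sum 2^(b(a-1)) + ... + 2^b + 1."""
    return sum(2 ** (b * i) for i in range(a))

# Check the factorisation of 2^m - 1 for several splittings m = a*b.
checks = []
for a, b in [(2, 3), (3, 2), (2, 5), (4, 3), (3, 5)]:
    m = a * b
    checks.append(repunit(a, b) * (2 ** b - 1) == 2 ** m - 1)
    checks.append(repunit(b, a) * (2 ** a - 1) == 2 ** m - 1)

# For m = 6 with a = 2, b = 3 the multiplier (2^6-1)/(2^3-1) = 9, so the
# exponents 9j index minimal polynomials of the subfield GF(2^3); the
# multiplier (2^6-1)/(2^2-1) = 21 plays the same role for GF(2^2).
multipliers = (repunit(2, 3), repunit(3, 2))
```

The multipliers 9 and 21 are exactly the subscripts that appear in the *GF*(64) example which follows.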

For *M*<sub>1,6</sub>, following the same procedure,

$$\begin{aligned} M_{1,6} &= x^6 + S_{1,6}x^5 + (S_{3,6} + S_{5,6} + S_{9,6})x^4 + (S_{7,6} + S_{11,6} + S_{13,6} + S_{21,6})x^3 \\ &\quad + (S_{15,6} + S_{23,6} + S_{27,6})x^2 + S_{31,6}x + \alpha^{63}. \end{aligned} \tag{4.31}$$

Substituting for the minimal polynomials, *M*<sub>*i*,6</sub>, produces

$$\begin{aligned} M_{0,6} &= x + 1 \\ M_{1,6} &= x^6 + x + 1 \\ M_{3,6} &= x^6 + x^4 + x^2 + x + 1 \\ M_{5,6} &= x^6 + x^5 + x^2 + x + 1 \\ M_{7,6} &= x^6 + x^3 + 1 \\ M_{9,6} &= x^3 + x^2 + 1 \\ M_{11,6} &= x^6 + x^5 + x^3 + x^2 + 1 \\ M_{13,6} &= x^6 + x^4 + x^3 + x + 1 \\ M_{15,6} &= x^6 + x^5 + x^4 + x^2 + 1 \\ M_{21,6} &= x^2 + x + 1 \\ M_{23,6} &= x^6 + x^5 + x^4 + x + 1 \\ M_{27,6} &= x^3 + x + 1 \\ M_{31,6} &= x^6 + x^5 + 1. \end{aligned} \tag{4.32}$$

Notice that *M*<sub>9,6</sub> = *M*<sub>3,3</sub> because α<sup>9</sup> + α<sup>18</sup> + α<sup>36</sup> = 1, and *M*<sub>27,6</sub> = *M*<sub>1,3</sub> because α<sup>27</sup> + α<sup>54</sup> + α<sup>45</sup> = 0. Also *M*<sub>21,6</sub> = *M*<sub>1,2</sub> because α<sup>21</sup> + α<sup>42</sup> = 1. The order of α is 63, which factorises as 7 × 3 × 3, and so *x*<sup>63</sup> − 1 will have roots of order 7 (α<sup>9</sup>) and roots of order 3 (α<sup>21</sup>). Another way of looking at this is the factorisation of *x*<sup>63</sup> − 1: *x*<sup>7</sup> − 1 is a factor and *x*<sup>3</sup> − 1 is a factor,

$$\begin{aligned} x^{63} - 1 &= (x^{7} - 1)(1 + x^{7} + x^{14} + x^{21} \\ &\quad + x^{28} + x^{35} + x^{42} + x^{49} + x^{56}) \end{aligned} \tag{4.33}$$

also

$$\begin{aligned} x^{63} - 1 = (x^3 - 1)(1 &+ x^3 + x^6 + x^9 + x^{12} + x^{15} + x^{18} + x^{21} \\ &+ x^{24} + x^{27} + x^{30} + x^{33} + x^{36} + x^{39} + x^{42} + x^{45} \\ &+ x^{48} + x^{51} + x^{54} + x^{57} + x^{60}) \end{aligned} \tag{4.34}$$

and

$$\begin{aligned} x^3 - 1 &= (x + 1)(x^2 + x + 1) \\ x^7 - 1 &= (x + 1)(x^3 + x + 1)(x^3 + x^2 + 1) \\ x^{63} - 1 &= (x + 1)(x^2 + x + 1)(x^3 + x + 1)(x^3 + x^2 + 1)(x^6 + x + 1) \\ &\quad \times (x^6 + x^4 + x^2 + x + 1) \cdots (x^6 + x^5 + 1). \end{aligned} \tag{4.35}$$
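That the thirteen minimal polynomials listed in (4.32) multiply together to give *x*<sup>63</sup> − 1 can be verified directly with carry-less arithmetic. A minimal sketch:

```python
def pmul(a, b):
    """Carry-less product of GF(2) polynomials held as bitmasks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

# All factors of x^63 - 1 over GF(2): the minimal polynomials of (4.32).
FACTORS = [
    0b11,       # x + 1
    0b111,      # x^2 + x + 1
    0b1011,     # x^3 + x + 1
    0b1101,     # x^3 + x^2 + 1
    0b1000011,  # x^6 + x + 1
    0b1010111,  # x^6 + x^4 + x^2 + x + 1
    0b1100111,  # x^6 + x^5 + x^2 + x + 1
    0b1001001,  # x^6 + x^3 + 1
    0b1101101,  # x^6 + x^5 + x^3 + x^2 + 1
    0b1011011,  # x^6 + x^4 + x^3 + x + 1
    0b1110101,  # x^6 + x^5 + x^4 + x^2 + 1
    0b1110011,  # x^6 + x^5 + x^4 + x + 1
    0b1100001,  # x^6 + x^5 + 1
]

product = 1
for f in FACTORS:
    product = pmul(product, f)
```

The degrees sum to 1 + 2 + 3 + 3 + 6 × 9 = 63, and the product equals the bitmask of *x*<sup>63</sup> + 1.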

For *M*<sub>1,7</sub>,

$$\begin{aligned} M_{1,7} &= x^7 + S_{1,7}x^6 + (S_{3,7} + S_{5,7} + S_{9,7})x^5 + (S_{7,7} + S_{11,7} + S_{13,7} + S_{19,7} + S_{21,7})x^4 \\ &\quad + (S_{15,7} + S_{23,7} + S_{27,7} + S_{29,7} + S_{43,7})x^3 + (S_{31,7} + S_{47,7} + S_{55,7})x^2 \\ &\quad + S_{63,7}x + \alpha^{127}. \end{aligned} \tag{4.36}$$

Although the above procedure using the sums of powers of α from the cyclotomic cosets may be used to generate the minimal polynomials *M*<sub>*i*,*m*</sub> for any *m*, the procedure becomes tedious with increasing *m*, and it is easier to use the Mattson–Solomon polynomial or combinations of the idempotents as described in Sect. 4.4.

#### **4.3 The Mattson–Solomon Polynomial**

The Mattson–Solomon polynomial is very useful as it can be conveniently used to generate minimal polynomials and idempotents. It may also be used to design cyclic codes, RS codes and Goppa codes, as well as to determine the weight distribution of codes. The Mattson–Solomon polynomial [2] of a polynomial *a*(*x*) is a linear transformation of *a*(*x*) to *A*(*z*). The Mattson–Solomon polynomial is the same as the inverse Discrete Fourier Transform over a finite field. The polynomial variables *x* and *z* are used to distinguish the polynomials in either domain.

Let the splitting field of *x*<sup>*n*</sup> − 1 over F<sub>2</sub> be F<sub>2<sup>*m*</sup></sub>, where *n* is an odd integer and *m* > 1, let α be a generator of F<sub>2<sup>*m*</sup></sub>, and let the integer *r* = (2<sup>*m*</sup> − 1)/*n*. Let *a*(*x*) be a polynomial of degree at most *n* − 1 with coefficients over F<sub>2<sup>*m*</sup></sub>.

**Definition 4.2** (*Mattson–Solomon polynomial*) The Mattson–Solomon polynomial of *a*(*x*) is the linear transformation of *a*(*x*) to *A*(*z*) and is defined by [2]

$$A(z) = \text{MS}(a(\mathbf{x})) = \sum\_{j=0}^{n-1} a(\alpha^{-rj}) z^j. \tag{4.37}$$

The inverse Mattson–Solomon transformation or Fourier transform is

**Table 4.1** *GF*(16) extension field defined by 1 + α + α<sup>4</sup> = 0

| | | | | |
|---|---|---|---|---|
| α<sup>0</sup> = 1 | α<sup>1</sup> = α | α<sup>2</sup> = α<sup>2</sup> | α<sup>3</sup> = α<sup>3</sup> | α<sup>4</sup> = 1 + α |
| α<sup>5</sup> = α + α<sup>2</sup> | α<sup>6</sup> = α<sup>2</sup> + α<sup>3</sup> | α<sup>7</sup> = 1 + α + α<sup>3</sup> | α<sup>8</sup> = 1 + α<sup>2</sup> | α<sup>9</sup> = α + α<sup>3</sup> |
| α<sup>10</sup> = 1 + α + α<sup>2</sup> | α<sup>11</sup> = α + α<sup>2</sup> + α<sup>3</sup> | α<sup>12</sup> = 1 + α + α<sup>2</sup> + α<sup>3</sup> | α<sup>13</sup> = 1 + α<sup>2</sup> + α<sup>3</sup> | α<sup>14</sup> = 1 + α<sup>3</sup> |

$$a(\mathbf{x}) = \mathbf{M} \mathbf{S}^{-1}(A(\mathbf{z})) = \frac{1}{n} \sum\_{i=0}^{n-1} A(\alpha^{ri}) \mathbf{x}^{i}. \tag{4.38}$$

The integer *r* comes into play when 2<sup>*m*</sup> − 1 is not a prime, that is, not a Mersenne prime; otherwise *r* = 1. As an example, we will consider F<sub>2<sup>4</sup></sub>; the non-zero elements of the extension field are given in Table 4.1, with 1 + α + α<sup>4</sup> = 0, modulo 1 + *x*<sup>15</sup>.
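Table 4.1 can be regenerated in a few lines: repeatedly multiply by α and, whenever degree 4 appears, substitute α<sup>4</sup> = 1 + α. A minimal sketch with bitmask field elements:

```python
MOD = 0b10011  # 1 + x + x^4 = 0, so alpha^4 = 1 + alpha

table = []
elem = 1
for i in range(15):
    table.append(elem)   # table[i] is alpha^i as a bitmask in (1, a, a^2, a^3)
    elem <<= 1           # multiply by alpha
    if elem & 0b10000:   # degree 4 appeared: substitute alpha^4 = 1 + alpha
        elem ^= MOD
```

After the loop `elem` has returned to 1, confirming α<sup>15</sup> = 1, and the 15 stored bitmasks are distinct, matching the entries of Table 4.1.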

Consider the polynomial *a*(*x*) denoted as

$$a(\mathbf{x}) = \sum\_{i=0}^{n-1} a\_i \mathbf{x}^i = 1 + \mathbf{x}^3 + \mathbf{x}^4. \tag{4.39}$$

We will evaluate the Mattson–Solomon polynomial coefficient by coefficient:

$$\begin{aligned} A(0) &= a_0 + a_3 + a_4 = 1 + 1 + 1 = 1 \\ A(1) &= a_0 + a_3\alpha^{-3} + a_4\alpha^{-4} = 1 + \alpha^{12} + \alpha^{11} = 0 \\ A(2) &= a_0 + a_3\alpha^{-6} + a_4\alpha^{-8} = 1 + \alpha^{9} + \alpha^{7} = 0 \\ A(3) &= a_0 + a_3\alpha^{-9} + a_4\alpha^{-12} = 1 + \alpha^{6} + \alpha^{3} = \alpha^{8} \\ A(4) &= a_0 + a_3\alpha^{-12} + a_4\alpha^{-16} = 1 + \alpha^{3} + \alpha^{14} = 0 \\ A(5) &= a_0 + a_3\alpha^{-15} + a_4\alpha^{-20} = 1 + 1 + \alpha^{10} = \alpha^{10} \\ A(6) &= a_0 + a_3\alpha^{-18} + a_4\alpha^{-24} = 1 + \alpha^{12} + \alpha^{6} = \alpha \\ A(7) &= a_0 + a_3\alpha^{-21} + a_4\alpha^{-28} = 1 + \alpha^{9} + \alpha^{2} = \alpha^{12} \\ A(8) &= a_0 + a_3\alpha^{-24} + a_4\alpha^{-32} = 1 + \alpha^{6} + \alpha^{13} = 0 \\ A(9) &= a_0 + a_3\alpha^{-27} + a_4\alpha^{-36} = 1 + \alpha^{3} + \alpha^{9} = \alpha^{4} \\ A(10) &= a_0 + a_3\alpha^{-30} + a_4\alpha^{-40} = 1 + 1 + \alpha^{5} = \alpha^{5} \\ A(11) &= a_0 + a_3\alpha^{-33} + a_4\alpha^{-44} = 1 + \alpha^{12} + \alpha = \alpha^{6} \\ A(12) &= a_0 + a_3\alpha^{-36} + a_4\alpha^{-48} = 1 + \alpha^{9} + \alpha^{12} = \alpha^{2} \\ A(13) &= a_0 + a_3\alpha^{-39} + a_4\alpha^{-52} = 1 + \alpha^{6} + \alpha^{8} = \alpha^{3} \\ A(14) &= a_0 + a_3\alpha^{-42} + a_4\alpha^{-56} = 1 + \alpha^{3} + \alpha^{4} = \alpha^{9}. \end{aligned} \tag{4.40}$$

It can be seen that *A*(*z*) is

$$\begin{aligned} A(z) &= 1 + \alpha^8 z^3 + \alpha^{10} z^5 + \alpha z^6 + \alpha^{12} z^7 + \alpha^4 z^9 + \alpha^5 z^{10} + \alpha^6 z^{11} + \alpha^2 z^{12} \\ &\quad + \alpha^3 z^{13} + \alpha^9 z^{14}. \end{aligned}$$
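The coefficient-by-coefficient evaluation above is mechanical and easily automated. A minimal sketch using the same *GF*(16) bitmask representation (`ms_transform` and the other helpers are illustrative names, not from the text):

```python
MOD = 0b10011  # 1 + x + x^4, defining GF(16); n = 15, so r = 1

def gmul(a, b):
    """Multiply two GF(16) elements held as bitmask polynomials."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    while r and r.bit_length() > 4:
        r ^= MOD << (r.bit_length() - 5)
    return r

def alpha_pow(e):
    r = 1
    for _ in range(e % 15):
        r = gmul(r, 0b10)
    return r

def evaluate(mask, v):
    """Evaluate a binary polynomial (bitmask) at a GF(16) element v."""
    r, p = 0, 1
    while mask:
        if mask & 1:
            r ^= p
        p = gmul(p, v)
        mask >>= 1
    return r

def ms_transform(mask, n=15):
    """Mattson-Solomon coefficients A_j = a(alpha^(-j)), as in Eq. (4.37)."""
    return [evaluate(mask, alpha_pow(-j)) for j in range(n)]

A = ms_transform(0b11001)  # a(x) = 1 + x^3 + x^4
```

Running this reproduces the coefficients of Eq. (4.40), with zeros exactly at *j* = 1, 2, 4, 8.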

*A*(*z*) has four zero coefficients, corresponding to the roots α<sup>−1</sup>, α<sup>−2</sup>, α<sup>−4</sup> and α<sup>−8</sup> of 1 + *x*<sup>3</sup> + *x*<sup>4</sup>. These are also 4 of the 15 roots of 1 + *x*<sup>15</sup>. Factorising 1 + *x*<sup>15</sup> produces the identity

$$1 + x^{15} = (1+x)(1+x+x^2)(1+x+x^4)(1+x^3+x^4)(1+x+x^2+x^3+x^4).\tag{4.41}$$

It can be seen that 1 + *x*<sup>3</sup> + *x*<sup>4</sup> is one of the factors of 1 + *x*<sup>15</sup>.

Another point to notice is that *A*(*z*) = *A*(*z*)<sup>2</sup> and so *A*(*z*) is an idempotent. The reason for this is that the inverse Mattson–Solomon polynomial of *A*(*z*) produces *a*(*x*), a polynomial that has binary coefficients. Let · denote the dot product of polynomials, i.e.

$$\left(\sum A\_i z^i\right) \cdot \left(\sum B\_i z^i\right) = \sum A\_i B\_i z^i.$$

It follows from the Mattson–Solomon polynomial that if *a*(*x*)*b*(*x*) = *c*(*x*), then Σ*C<sub>i</sub>z<sup>i</sup>* = Σ*A<sub>i</sub>B<sub>i</sub>z<sup>i</sup>*.

This concept is analogous to multiplication and convolution in the time and frequency domains, where the Fourier and inverse Fourier transforms correspond to the inverse Mattson–Solomon and Mattson–Solomon polynomials, respectively. In the above example, *A*(*z*) is an idempotent which leads to the following lemma.
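This multiplication/dot-product duality can be spot-checked numerically: take two binary polynomials, multiply them modulo *x*<sup>15</sup> − 1, and compare Mattson–Solomon coefficients term by term. A minimal sketch (helper names are illustrative):

```python
MOD = 0b10011  # 1 + x + x^4, defining GF(16); n = 15

def gmul(a, b):
    """Multiply two GF(16) elements held as bitmask polynomials."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    while r and r.bit_length() > 4:
        r ^= MOD << (r.bit_length() - 5)
    return r

def alpha_pow(e):
    r = 1
    for _ in range(e % 15):
        r = gmul(r, 0b10)
    return r

def evaluate(mask, v):
    r, p = 0, 1
    while mask:
        if mask & 1:
            r ^= p
        p = gmul(p, v)
        mask >>= 1
    return r

def ms(mask, n=15):
    """Mattson-Solomon coefficients A_j = a(alpha^(-j))."""
    return [evaluate(mask, alpha_pow(-j)) for j in range(n)]

def polymul_mod(a, b, n=15):
    """GF(2)[x] product with exponents wrapped modulo n (i.e. mod x^n - 1)."""
    r = 0
    for i in range(n):
        if (a >> i) & 1:
            for j in range(n):
                if (b >> j) & 1:
                    r ^= 1 << ((i + j) % n)
    return r

a, b = 0b11001, 0b00110          # 1 + x^3 + x^4 and x + x^2
A, B, C = ms(a), ms(b), ms(polymul_mod(a, b))
```

Because evaluation at a fixed field element is a ring homomorphism modulo *x*<sup>*n*</sup> − 1, every coefficient satisfies *C<sub>j</sub>* = *A<sub>j</sub>B<sub>j</sub>*.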

**Lemma 4.1** *The Mattson–Solomon polynomial of a polynomial having binary coefficients is an idempotent.*

*Proof* Let *c*(*x*) = *a*(*x*) · *b*(*x*). The Mattson–Solomon polynomial of *c*(*x*) is *C*(*z*) = *A*(*z*)*B*(*z*). Setting *b*(*x*) = *a*(*x*) then *C*(*z*) = *A*(*z*)*A*(*z*) = *A*(*z*)2. If *a*(*x*) has binary coefficients, then *c*(*x*) = *a*(*x*) · *a*(*x*) = *a*(*x*) and *A*(*z*)<sup>2</sup> = *A*(*z*). Therefore *A*(*z*) is an idempotent.

Of course the reverse is true.

**Lemma 4.2** *The Mattson–Solomon polynomial of an idempotent is a polynomial having binary coefficients.*

*Proof* Let *c*(*x*) = *a*(*x*)*b*(*x*). The Mattson–Solomon polynomial of *c*(*x*) is *C*(*z*) = *A*(*z*)*B*(*z*). Setting *b*(*x*) = *a*(*x*), then *C*(*z*) = *A*(*z*) · *A*(*z*). If *a*(*x*) is an idempotent, then *c*(*x*) = *a*(*x*)<sup>2</sup> = *a*(*x*) and *A*(*z*) = *A*(*z*) · *A*(*z*). The only values for the coefficients of *A*(*z*) that satisfy this constraint are 0 and 1. Hence, the Mattson–Solomon polynomial, *A*(*z*), has binary coefficients.

A polynomial that has binary coefficients and is an idempotent is a binary idempotent, and combining Lemmas 4.1 and 4.2 produces the following lemma.

**Lemma 4.3** *The Mattson–Solomon polynomial of a binary idempotent is also a binary idempotent.*

*Proof* The proof follows immediately from the proofs of Lemmas 4.1 and 4.2. As *a*(*x*) is an idempotent, then from Lemma 4.1, *A*(*z*) has binary coefficients. As *a*(*x*) also has binary coefficients, then from Lemma 4.2, *A*(*z*) is an idempotent. Hence, *A*(*z*) is a binary idempotent.

As an example consider the binary idempotent *a*(*x*) from *GF*(16) listed in Table 4.1:

$$a(\mathbf{x}) = \mathbf{x} + \mathbf{x}^2 + \mathbf{x}^3 + \mathbf{x}^4 + \mathbf{x}^6 + \mathbf{x}^8 + \mathbf{x}^9 + \mathbf{x}^{12}.$$

The Mattson–Solomon polynomial *A*(*z*) is

$$A(z) = z^{7} + z^{11} + z^{13} + z^{14},$$

which is also a binary idempotent.
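Lemma 4.3 can be verified for this example: the support {1, 2, 3, 4, 6, 8, 9, 12} of *a*(*x*) is the union of the cyclotomic cosets *C*<sub>1</sub> and *C*<sub>3</sub> of 15, so it is closed under doubling and *a*(*x*) is an idempotent, and its Mattson–Solomon coefficients come out as 0 or 1 with support exactly {7, 11, 13, 14}. A minimal sketch:

```python
MOD = 0b10011  # GF(16) via 1 + x + x^4; n = 15

def gmul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    while r and r.bit_length() > 4:
        r ^= MOD << (r.bit_length() - 5)
    return r

def alpha_pow(e):
    r = 1
    for _ in range(e % 15):
        r = gmul(r, 0b10)
    return r

def evaluate(mask, v):
    r, p = 0, 1
    while mask:
        if mask & 1:
            r ^= p
        p = gmul(p, v)
        mask >>= 1
    return r

exponents = (1, 2, 3, 4, 6, 8, 9, 12)
a_mask = sum(1 << i for i in exponents)

# Squaring a binary polynomial mod x^15 - 1 doubles every exponent mod 15;
# idempotency means the support is closed under this doubling.
doubled = sum(1 << ((2 * i) % 15) for i in exponents)

A = [evaluate(a_mask, alpha_pow(-j)) for j in range(15)]
support = [j for j in range(15) if A[j]]
```

The support {7, 11, 13, 14} is itself a cyclotomic coset, so *A*(*z*) is again a binary idempotent, as the lemma asserts.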

Since the Mattson–Solomon polynomial of *a*(*x*<sup>−1</sup>) is the same as the inverse Mattson–Solomon polynomial of *a*(*x*), consider the following example:

$$a(x) = x^{-7} + x^{-11} + x^{-13} + x^{-14} = x + x^2 + x^4 + x^8.$$

The Mattson–Solomon polynomial *A*(*z*) is the binary idempotent

$$A(z) = z + z^2 + z^3 + z^4 + z^6 + z^8 + z^9 + z^{12}.$$

This is the reverse of the first example above.

The polynomial 1 + *x* + *x*<sup>3</sup> has no roots in common with 1 + *x*<sup>15</sup>, and so defining *b*(*x*)

$$b(x) = (1 + x + x^3)(1 + x^3 + x^4) = 1 + x + x^5 + x^6 + x^7. \tag{4.42}$$

When the Mattson–Solomon polynomial is evaluated, *B*(*z*) is given by

$$B(z) = 1 + z + z^{5} + z^{6} + z^{7}.\tag{4.43}$$

#### **4.4 Binary Cyclic Codes Derived from Idempotents**

In their book, MacWilliams and Sloane [2] describe the Mattson–Solomon polynomial and show that cyclic codes may be constructed straightforwardly from idempotents. An idempotent is a polynomial θ(*x*) with coefficients from a base field *GF*(*p*) that has the property that θ<sup>*p*</sup>(*x*) = θ(*x*). The family of Bose–Chaudhuri–Hocquenghem (BCH) cyclic codes may be constructed directly from the Mattson–Solomon polynomial. From the idempotents, other cyclic codes may be constructed which have low-weight dual-code codewords, or equivalently a sparse parity-check matrix (see Chap. 12).

**Definition 4.3** (*Binary Idempotent*) Consider *e*(*x*) ∈ *T*(*x*); *e*(*x*) is an idempotent if *e*(*x*) = *e*<sup>2</sup>(*x*) = *e*(*x*<sup>2</sup>) (mod *x<sup>n</sup>* − 1).

An (*n*, *k*) binary cyclic code may be described by the generator polynomial *g*(*x*) ∈ *T*(*x*) of degree *n* − *k* and the parity-check polynomial *h*(*x*) ∈ *T*(*x*) of degree *k*, such that *g*(*x*)*h*(*x*) = *x<sup>n</sup>* − 1. According to [2], as an alternative to *g*(*x*), an idempotent may also be used to generate cyclic codes. Any binary cyclic code can be described by a unique idempotent *e<sub>g</sub>*(*x*) ∈ *T*(*x*) which consists of a sum of primitive idempotents. The unique idempotent *e<sub>g</sub>*(*x*) is known as the *generating idempotent* and, as the name implies, *g*(*x*) is a divisor of *e<sub>g</sub>*(*x*); to be more specific, *e<sub>g</sub>*(*x*) = *m*(*x*)*g*(*x*), where *m*(*x*) ∈ *T*(*x*) contains repeated factors or non-factors of *x<sup>n</sup>* − 1.

**Lemma 4.4** *If e*(*x*) ∈ *T*(*x*) *is an idempotent, E*(*z*) = *MS*(*e*(*x*)) ∈ *T*(*z*)*.*

*Proof* Since *e*(*x*) = *e*(*x*)<sup>2</sup> (mod *x<sup>n</sup>* − 1), from (4.37) it follows that *e*(α−*rj*) = *e*(α−*rj*)<sup>2</sup> for *j* = {0, 1,..., *n* − 1} and some integer *r*. Clearly *e*(α−*rj*) ∈ {0, 1} implying that *E*(*z*) is a binary polynomial.

**Definition 4.4** (*Cyclotomic Coset*) Let *s* be a positive integer; the 2-cyclotomic coset of *s* (mod *n*) is given by

$$\mathcal{C}\_s = \left\{ 2^i s \pmod{n} \mid 0 \le i \le t \right\},$$

where we shall always assume that the subscript *s* is the smallest element in the set *Cs* and *t* is the smallest positive integer such that 2*<sup>t</sup>*+<sup>1</sup>*s* ≡ *s* (mod *n*).

For convenience, we will use the term cyclotomic coset to refer to the 2-cyclotomic coset throughout this book. If *N* is the set consisting of the smallest elements of all possible cyclotomic cosets, then it follows that

$$\bigcup_{s \in N} C_s = \{0, 1, 2, \dots, n - 1\}.$$

**Definition 4.5** (*Binary Cyclotomic Idempotent*) Let the polynomial *es*(*x*) ∈ *T*(*x*) be given by


$$e_s(x) = \sum_{0 \le i \le |C_s| - 1} x^{C_{s,i}},\tag{4.44}$$

where |*C<sub>s</sub>*| is the number of elements in *C<sub>s</sub>* and *C<sub>s,i</sub>* = 2<sup>*i*</sup>*s* (mod *n*) is the (*i* + 1)th element of *C<sub>s</sub>*. The polynomial *e<sub>s</sub>*(*x*) is called a binary cyclotomic idempotent.

*Example 4.2* The complete set of cyclotomic cosets of 63 and their corresponding binary cyclotomic idempotents are as follows:

$$\begin{aligned}
C_0 &= \{0\} & e_0(x) &= 1 \\
C_1 &= \{1, 2, 4, 8, 16, 32\} & e_1(x) &= x + x^2 + x^4 + x^8 + x^{16} + x^{32} \\
C_3 &= \{3, 6, 12, 24, 48, 33\} & e_3(x) &= x^3 + x^6 + x^{12} + x^{24} + x^{33} + x^{48} \\
C_5 &= \{5, 10, 20, 40, 17, 34\} & e_5(x) &= x^5 + x^{10} + x^{17} + x^{20} + x^{34} + x^{40} \\
C_7 &= \{7, 14, 28, 56, 49, 35\} & e_7(x) &= x^7 + x^{14} + x^{28} + x^{35} + x^{49} + x^{56} \\
C_9 &= \{9, 18, 36\} & e_9(x) &= x^9 + x^{18} + x^{36} \\
C_{11} &= \{11, 22, 44, 25, 50, 37\} & e_{11}(x) &= x^{11} + x^{22} + x^{25} + x^{37} + x^{44} + x^{50} \\
C_{13} &= \{13, 26, 52, 41, 19, 38\} & e_{13}(x) &= x^{13} + x^{19} + x^{26} + x^{38} + x^{41} + x^{52} \\
C_{15} &= \{15, 30, 60, 57, 51, 39\} & e_{15}(x) &= x^{15} + x^{30} + x^{39} + x^{51} + x^{57} + x^{60} \\
C_{21} &= \{21, 42\} & e_{21}(x) &= x^{21} + x^{42} \\
C_{23} &= \{23, 46, 29, 58, 53, 43\} & e_{23}(x) &= x^{23} + x^{29} + x^{43} + x^{46} + x^{53} + x^{58} \\
C_{27} &= \{27, 54, 45\} & e_{27}(x) &= x^{27} + x^{45} + x^{54} \\
C_{31} &= \{31, 62, 61, 59, 55, 47\} & e_{31}(x) &= x^{31} + x^{47} + x^{55} + x^{59} + x^{61} + x^{62}
\end{aligned}$$

and *N* = {0, 1, 3, 5, 7, 9, 11, 13, 15, 21, 23, 27, 31}.
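The cosets of Example 4.2 can be generated mechanically. A minimal sketch (`cyclotomic_cosets` is a hypothetical helper name):

```python
def cyclotomic_cosets(n):
    """Map each coset leader s to its 2-cyclotomic coset {s*2^i mod n}."""
    seen, cosets = set(), {}
    for s in range(n):
        if s in seen:
            continue
        c, e = [], s
        while e not in c:
            c.append(e)       # exponents of the idempotent e_s(x), Eq. (4.44)
            e = (2 * e) % n
        seen.update(c)
        cosets[s] = sorted(c)
    return cosets

cosets = cyclotomic_cosets(63)
leaders = sorted(cosets)
```

The sorted leaders reproduce *N* = {0, 1, 3, 5, 7, 9, 11, 13, 15, 21, 23, 27, 31}, and the coset sizes sum to 63.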

**Definition 4.6** (*Binary Parity-Check Idempotent*) Let *M* ⊆ *N* and let the polynomial *u*(*x*) ∈ *T*(*x*) be defined by

$$u(x) = \sum_{s \in M} e_s(x),\tag{4.45}$$

where *es*(*x*) is an idempotent. The polynomial *u*(*x*) is called a binary parity-check idempotent.

The binary parity-check idempotent *u*(*x*) can be used to describe an [*n*, *k*] cyclic code. Since GCD(*u*(*x*), *x<sup>n</sup>* − 1) = *h*(*x*), the polynomial *ū*(*x*) = *x*<sup>deg(*u*(*x*))</sup>*u*(*x*<sup>−1</sup>) and its *n* cyclic shifts (mod *x<sup>n</sup>* − 1) can be used to define the parity-check matrix of a binary cyclic code. In general, wt<sub>*H*</sub>(*ū*(*x*)) is much lower than wt<sub>*H*</sub>(*h*(*x*)), and therefore a sparse parity-check matrix can be derived from *ū*(*x*). This is important for cyclic codes designed to be used as low-density parity-check (LDPC) codes, see Chap. 12.
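Forming the matrix rows from *ū*(*x*) and its cyclic shifts can be sketched as follows, on a deliberately small, hypothetical example (*u*(*x*) = 1 + *x* + *x*<sup>3</sup> with *n* = 7; the helper names are illustrative):

```python
def reciprocal(mask):
    """x^deg(u) * u(1/x): reverse the coefficient bits of u."""
    deg = mask.bit_length() - 1
    return sum(((mask >> i) & 1) << (deg - i) for i in range(deg + 1))

def parity_check_rows(u_mask, n):
    """n cyclic shifts (mod x^n - 1) of the reciprocal polynomial of u."""
    ub = reciprocal(u_mask)
    bits = [(ub >> i) & 1 for i in range(n)]
    return [[bits[(j - s) % n] for j in range(n)] for s in range(n)]

H = parity_check_rows(0b1011, 7)  # u(x) = 1 + x + x^3
```

Every row has the same Hamming weight as *ū*(*x*), which is why a low-weight *ū*(*x*) directly yields a sparse parity-check matrix.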

#### *4.4.1 Non-Primitive Cyclic Codes Derived from Idempotents*

The factors of 2<sup>*m*</sup> − 1 dictate the degrees of the minimal polynomials through the order of the cyclotomic cosets. Some relatively short non-primitive cyclic codes have minimal polynomials of high degree, which makes it tedious to derive the generator polynomial or parity-check polynomial using the Mattson–Solomon polynomial. The prime factors of 2<sup>*m*</sup> − 1 for *m* ≤ 43 are tabulated in Table 4.2.

The Mersenne primes shown in Table 4.2 are 2<sup>2</sup> − 1, 2<sup>3</sup> − 1, 2<sup>5</sup> − 1, 2<sup>7</sup> − 1, 2<sup>13</sup> − 1, 2<sup>17</sup> − 1, 2<sup>19</sup> − 1 and 2<sup>31</sup> − 1, and cyclic codes of these lengths are primitive cyclic codes. Non-primitive cyclic codes have lengths corresponding to factors of 2<sup>*m*</sup> − 1 which are not Mersenne primes. Also it may be seen in Table 4.2 that for *m* even, 3 is a common factor. Where *m* is a multiple of 5, with *m* = 5 × *s*, 31 is a common factor and all *M*<sub>*j*,5</sub> minimal polynomials will be contained in the set *M*<sub>*j*,5×*s*</sub> of minimal polynomials.

As an example of how useful Table 4.2 can be, consider a code of length 113. Table 4.2 shows that 2<sup>28</sup> − 1 contains 113 as a factor. This means that there is a polynomial of degree 28 that has a root β of order 113. In fact, β = α<sup>2375535</sup>, where α is a primitive root, because 2<sup>28</sup> − 1 = 2375535 × 113.

The cyclotomic cosets of 113 are as follows:

$$\begin{aligned} C_0 &= \{0\} \\ C_1 &= \{1, 2, 4, 8, 16, 32, 64, 15, 30, 60, 7, 14, 28, 56, \\ &\quad\ 112, 111, 109, 105, 97, 81, 49, 98, 83, 53, 106, 99, 85, 57\} \\ C_3 &= \{3, 6, 12, 24, 48, 96, 79, 45, 90, 67, 21, 42, 84, \\ &\quad\ 55, 110, 107, 101, 89, 65, 17, 34, 68, 23, 46, 92, 71, 29, 58\} \\ C_5 &= \{5, 10, 20, 40, 80, 47, 94, 75, 37, 74, 35, 70, 27, \\ &\quad\ 54, 108, 103, 93, 73, 33, 66, 19, 38, 76, 39, 78, 43, 86, 59\} \\ C_9 &= \{9, 18, 36, 72, 31, 62, 11, 22, 44, 88, 63, 13, 26, \\ &\quad\ 52, 104, 95, 77, 41, 82, 51, 102, 91, 69, 25, 50, 100, 87, 61\} \end{aligned}$$

Each coset apart from *C*<sub>0</sub> may be used to define the 28 roots of a polynomial having binary coefficients and of degree 28. Alternatively, each cyclotomic coset may be used to define the non-zero coefficients of a polynomial, a minimum weight idempotent (see Sect. 4.4). Adding together any combination of the 5 minimum weight idempotents generates a cyclic code of length 113. Consequently, there are only 2<sup>5</sup> − 2 = 30 non-trivial, different cyclic codes of length 113, and some of these will be equivalent codes. Using Euclid's algorithm, it is easy to find the common factors of each idempotent combination and *x*<sup>113</sup> − 1. The resulting polynomial may be used as the generator polynomial or the parity-check polynomial of the cyclic code.


**Table 4.2** Prime factors of 2<sup>*m*</sup> − 1

For example, consider the GCD of *e*<sub>1</sub>(*x*) + *e*<sub>3</sub>(*x*) = *x* + *x*<sup>2</sup> + *x*<sup>3</sup> + *x*<sup>4</sup> + *x*<sup>6</sup> + *x*<sup>8</sup> + ... + *x*<sup>109</sup> + *x*<sup>110</sup> + *x*<sup>111</sup> + *x*<sup>112</sup>, the idempotent given by *C*<sub>1</sub> and *C*<sub>3</sub>, and *x*<sup>113</sup> − 1. This is the polynomial *u*(*x*), which turns out to have degree 57:

$$\begin{aligned} u(x) &= 1 + x + x^2 + x^3 + x^5 + x^6 + x^7 + x^{10} + x^{13} \\ &\quad \dots + x^{51} + x^{52} + x^{53} + x^{54} + x^{56} + x^{57}. \end{aligned}$$

Using *u*(*x*) as the parity-check polynomial of the cyclic code produces a (113, 57, 18) code. This is quite a good code as the very best (113, 57) code has a minimum Hamming distance of 19.
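The Euclid GCD step is only a few lines over *GF*(2). The length-113 computation is lengthy to reproduce here, so the sketch below instead checks the length-15 idempotent *e*<sub>1</sub>(*x*) + *e*<sub>3</sub>(*x*) from Sect. 4.3, whose GCD with *x*<sup>15</sup> − 1 has degree 11 (helper names are hypothetical):

```python
def pdeg(a):
    return a.bit_length() - 1

def pmod(a, m):
    """Remainder of a divided by m over GF(2) (bitmask polynomials)."""
    while a and pdeg(a) >= pdeg(m):
        a ^= m << (pdeg(a) - pdeg(m))
    return a

def pgcd(a, b):
    """Euclid's algorithm for GF(2) polynomials."""
    while b:
        a, b = b, pmod(a, b)
    return a

# e_1(x) + e_3(x) for n = 15: exponents C_1 u C_3 = {1,2,4,8} u {3,6,12,9}
idem = sum(1 << i for i in (1, 2, 3, 4, 6, 8, 9, 12))
g = pgcd((1 << 15) | 1, idem)   # GCD with x^15 - 1
```

The same `pgcd` call, applied to the degree-112 idempotent and *x*<sup>113</sup> − 1, yields the degree-57 polynomial *u*(*x*) quoted above.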

As another example of using this method for non-primitive cyclic code construction, consider the factors of 2<sup>39</sup> − 1 in Table 4.2. It will be seen that 79 is a factor and so a cyclic code of length 79 may be constructed from polynomials of degree 39. The cyclotomic cosets of 79 are as follows:

$$\begin{aligned} C_0 &= \{0\} \\ C_1 &= \{1, 2, 4, 8, 16, 32, 64, 49, 19, 38, 76, 73, \dots, 20, 40\} \\ C_3 &= \{3, 6, 12, 24, 48, 17, 34, 68, 57, 35, 70, \dots, 60, 41\} \end{aligned}$$

The GCD of the idempotent sum given by the cyclotomic cosets *C*<sub>0</sub> + *C*<sub>1</sub> and *x*<sup>79</sup> − 1 is the polynomial *u*(*x*) of degree 40:

$$u(x) = 1 + x + x^3 + x^5 + x^8 + x^{11} + x^{12} + x^{16} + \dots + x^{28} + x^{29} + x^{34} + x^{36} + x^{37} + x^{40}.$$

Using *u*(*x*) as the parity-check polynomial of the cyclic code produces a (79, 40, 15) code. This is the quadratic residue cyclic code for the prime number 79 and is a best-known code.

In a further example, Table 4.2 shows that 2<sup>37</sup> − 1 has 223 as a factor. The GCD of the idempotent given by the cyclotomic coset *C*<sub>3</sub>, *x*<sup>3</sup> + *x*<sup>6</sup> + *x*<sup>12</sup> + *x*<sup>24</sup> + *x*<sup>48</sup> + ... + *x*<sup>198</sup> + *x*<sup>204</sup>, and *x*<sup>223</sup> − 1 is the polynomial *u*(*x*) of degree 111:

$$u(x) = 1 + x^2 + x^3 + x^4 + x^8 + x^9 + x^{10} + x^{12} + \dots + x^{92} + x^{93} + x^{95} + x^{103} + x^{107} + x^{111}.$$

Using *u*(*x*) as the parity-check polynomial of the cyclic code produces a (223, 111, 32) cyclic code.

#### **4.5 Binary Cyclic Codes of Odd Lengths from 129 to 189**

Since many of the best-known codes are cyclic codes, it is useful to have a table of the best cyclic codes. The literature already contains tables of the best cyclic codes up to length 127 and so the following table starts at 129. All possible binary cyclic codes up to length 189 have been constructed and their minimum Hamming distance has been evaluated.

The highest minimum distance attainable by all binary cyclic codes of odd lengths 129 ≤ *n* ≤ 189 is tabulated in Table 4.3. The column "Roots of *g*(*x*)" in Table 4.3 denotes the exponents of the roots of the generator polynomial *g*(*x*), excluding the conjugate roots. All cyclic codes with generator polynomials 1 + *x* and (*x<sup>n</sup>* − 1)/(1 + *x*) are excluded from Table 4.3, since they are trivial codes, and since primes *n* = 8*m* ± 3 give rise to these trivial cyclic codes only, there is no entry in the table for these primes. The number of permutation-inequivalent and non-degenerate cyclic codes, excluding the two trivial codes mentioned earlier, for each odd integer *n* is given by *N<sub>C</sub>*. The primitive polynomial *m*(*x*) defining the field is given in octal. Full details describing the derivation of Table 4.3 are provided in Sect. 5.3.

In Table 4.3, there is no cyclic code that improves the lower bound given by Brouwer [1], but there are 134 cyclic codes that meet this lower bound and these codes are printed in bold.

#### **4.6 Summary**

The important large family of binary cyclic codes has been explored in this chapter. Starting with cyclotomic cosets, the minimal polynomials were introduced. The Mattson–Solomon polynomial was described and it was shown to be an inverse discrete Fourier transform based on a primitive root of unity. The usefulness of the Mattson–Solomon polynomial in the design of cyclic codes was demonstrated. The relationship between idempotents and the Mattson–Solomon polynomial of a polynomial that has binary coefficients was described with examples given. It was shown how binary cyclic codes may be easily derived from idempotents and the cyclotomic cosets. In particular, a method was described based on cyclotomic cosets for the design of high-degree non-primitive binary cyclic codes. Code examples using the method were presented.

A table listing the complete set of the best binary cyclic codes, having the highest minimum Hamming distance, has been included for all code lengths from 129 to 189 bits.





**Table 4.3**


## **References**



# **Chapter 5 Good Binary Linear Codes**

## **5.1 Introduction**

Two of the important performance indicators for a linear code are the minimum Hamming distance and the weight distribution. Efficient algorithms for computing the minimum distance and weight distribution of linear codes are explored below. Using these methods, the minimum distances of all binary cyclic codes of lengths 129–189 have been enumerated. The results are presented in Chap. 4. Many improvements to the database of best-known codes are described below. In addition, methods of combining known codes to produce good codes are explored in detail. These methods are applied to cyclic codes, and many new binary codes have been found and are given below.

The quest to achieve Shannon's limit for the AWGN channel has been approached in a number of different ways. Here we consider the problem formulated by Shannon of the construction of good codes which maximise the difference between the error rate performance of uncoded transmission and coded transmission. For uncoded, bipolar transmission with matched filtered reception, it is well known (see for example Proakis [20]) that the bit error rate, *p<sub>b</sub>*, is given by

$$p\_b = \frac{1}{2} \text{erfc}\left(\sqrt{\frac{E\_b}{N\_0}}\right). \tag{5.1}$$

Comparing this equation with the equation for the probability of error when using coding, viz. the probability of deciding on one codeword rather than another, Eq. (1.4) given in Chap. 1, it can be seen that the improvement due to coding, the coding gain, is indicated by the term *d<sub>min</sub>* · *k*/*n*, the product of the minimum distance between codewords and the code rate. This is not the end of the story in calculating the overall probability of decoder error, because this error probability needs to be multiplied by the number of codewords at distance *d<sub>min</sub>* apart.

For a linear binary code, the Hamming distance between two codewords is equal to the Hamming weight of the codeword formed by adding the two codewords together. Moreover, as the probability of decoder error at high *E<sub>b</sub>*/*N*<sub>0</sub> values depends on the minimum Hamming distance between codewords, for a linear binary code the performance depends on the minimum Hamming weight codewords of the code, the *d<sub>min</sub>* of the code, and the number of codewords with this weight (the multiplicity). For a given code rate (*k*/*n*) and length *n*, the higher the weight of the minimum Hamming weight codewords of the code, the better the performance, assuming the multiplicity is not too high. It is for this reason that a great deal of research effort has been expended, around the world, in determining codes with the highest minimum Hamming weight for a given code rate (*k*/*n*) and length *n*. These codes are called the best-known codes with parameters (*n*, *k*, *d*), where *d* is understood to be the *d<sub>min</sub>* of the code, and the codes are tabulated in a database available online [12], sometimes with a brief description of, or reference to, their method of construction.<sup>1</sup>

In this approach, it is assumed that a decoding algorithm either exists or will be invented which realises the full performance of a best-known code. For binary codes of length less than 200 bits the Dorsch decoder described in Chap. 15 does realise the full performance of the code.

Computing the minimum Hamming weight codewords of a linear code is, in general, a Nondeterministic Polynomial-time (NP) hard problem, as conjectured by [2] and later proved by [24]. Nowadays, it is common practice to use a multithreaded algorithm which runs on multiple parallel computers (grid computing) for minimum Hamming distance evaluation. Even then, it is not always possible to evaluate the exact minimum Hamming distance for large codes. For some algebraic codes, however, there are shortcuts that make it possible to obtain lower and upper bounds on this distance. But knowing these bounds is not sufficient, as the whole idea is to know explicitly the exact minimum Hamming distance of a specific constructed code. As a consequence, algorithms for evaluating the minimum Hamming distance of a code are very important in this subject area, and these are described in the following section.

It is worth mentioning that a more accurate benchmark of how good a code is, is in fact its Hamming weight distribution. Whilst computing the minimum Hamming distance of a code is in general NP-hard, computing the Hamming weight distribution of a code is even more complex. In general, for two codes of the same length and dimension but of different minimum Hamming distance, we can be reasonably certain that the code with the higher distance is the superior code. Unless we are required to decide between two codes with the same parameters, including minimum Hamming distance, it is not necessary to go down the route of evaluating the Hamming weight distribution of both codes.

<sup>1</sup>Multiplicities are ignored in the compiling of the best-known code tables, with the result that sometimes the best-known code from the tables is not the code that has the best performance.

# **5.2 Algorithms to Compute the Minimum Hamming Distance of Binary Linear Codes**

#### *5.2.1 The First Approach to Minimum Distance Evaluation*

For an [*n*, *k*, *d*] linear code over F<sub>2</sub> with a reduced-echelon generator matrix *G<sub>sys</sub>* = [*I<sub>k</sub>* | *P*], where *I<sub>k</sub>* and *P* are *k* × *k* identity and *k* × (*n* − *k*) matrices respectively, a codeword of this linear code can be generated by taking a linear combination of some rows of *G<sub>sys</sub>*. Since the minimum Hamming distance of a linear code is the minimum non-zero weight among all of the 2<sup>*k*</sup> codewords, a brute-force method to compute the minimum distance is to generate codewords by taking

$$
\binom{k}{1}, \binom{k}{2}, \binom{k}{3}, \dots, \binom{k}{k-1}, \text{ and } \binom{k}{k}
$$

linear combinations of the rows in *G<sub>sys</sub>*, noting the weight of each codeword generated and returning the minimum weight codeword of all 2<sup>*k*</sup> − 1 non-zero codewords. This method gives not only the minimum distance, but also the weight distribution of a code. It is obvious that as *k* grows larger this method becomes infeasible. However, if *n* − *k* is not too large, the minimum distance can still be obtained by evaluating the weight distribution of the [*n*, *n* − *k*, *d*<sup>⊥</sup>] dual code and using the MacWilliams identities to compute the weight distribution of the code. It should be noted that the whole weight distribution of the [*n*, *n* − *k*, *d*<sup>⊥</sup>] dual code has to be obtained, not just the minimum distance of the dual code.
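The dual-code route can be sketched as follows (a minimal illustration of the MacWilliams identity $A_j = 2^{k-n} \sum_i B_i K_j(i)$, with $K_j(i)$ the Krawtchouk polynomial; the [7, 4] Hamming code and its [7, 3] simplex dual serve purely as a small test case, and the function names are ours):

```python
from itertools import product
from math import comb

def weight_dist(G, n):
    # Exhaustive weight distribution; generator rows held as bit masks.
    dist = [0] * (n + 1)
    for bits in product([0, 1], repeat=len(G)):
        cw = 0
        for b, row in zip(bits, G):
            if b:
                cw ^= row
        dist[bin(cw).count("1")] += 1
    return dist

def macwilliams(B, n, k):
    # Distribution of an [n, k] code from that of its [n, n-k] dual:
    # A_j = 2^(k-n) * sum_i B_i * K_j(i), K_j the Krawtchouk polynomial.
    A = []
    for j in range(n + 1):
        s = sum(B[i] * sum((-1) ** t * comb(i, t) * comb(n - i, j - t)
                           for t in range(j + 1))
                for i in range(n + 1))
        A.append(s // 2 ** (n - k))
    return A

G = [49, 82, 100, 120]   # [7, 4] Hamming code, rows as bit masks
H = [27, 45, 78]         # its [7, 3] simplex dual
print(weight_dist(G, 7))
print(macwilliams(weight_dist(H, 7), 7, 4))   # both print [1, 0, 0, 7, 7, 0, 0, 1]
```

For this toy code the direct enumeration is trivial; the point is that `macwilliams` needs only the 2<sup>*n*−*k*</sup> dual codewords, which is the cheaper route whenever *n* − *k* < *k*.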

In direct codeword evaluation, it is clear that there are too many unnecessary codeword enumerations involved. A better approach which avoids enumerating large numbers of unnecessary codewords can be devised. Let

$$c = (i \mid p) = (c_0, c_1, \dots, c_{k-1} \mid c_k, \dots, c_{n-2}, c_{n-1})$$

be a codeword of a binary linear code of minimum distance *d*. Let *c*′ = (*i*′ | *p*′) be a codeword of weight *d*; then if wt<sub>*H*</sub>(*i*′) = *w* for some integer *w* < *d*, wt<sub>*H*</sub>(*p*′) = *d* − *w*. This means that at most

$$\sum\_{\mathbf{w}=1}^{\min\{d-1,k\}} \binom{k}{\mathbf{w}} \tag{5.2}$$

codewords are required to be enumerated.

In practice, *d* is unknown, so an upper bound *d<sub>ub</sub>* on the minimum distance is used during the evaluation; the minimum Hamming weight found thus far serves this purpose. Once all $\sum_{w'=1}^{w} \binom{k}{w'}$ codewords of information weight up to *w* have been enumerated, any codeword not yet seen must have information weight, and hence Hamming weight, of at least *w* + 1. Therefore, alongside the upper bound, a lower bound *d<sub>lb</sub>* = *w* + 1 on the minimum distance is also obtained. The evaluation continues until the condition *d<sub>lb</sub>* ≥ *d<sub>ub</sub>* is met, and in this event *d<sub>ub</sub>* is the minimum Hamming distance.
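This first approach can be sketched as follows (a minimal single-matrix implementation of our own, with generator rows held as integer bit masks; the [7, 4] Hamming and [8, 4] extended Hamming codes are used purely as small test cases):

```python
from itertools import combinations

def min_distance(G, k):
    # Enumerate codewords in order of information weight w.
    # d_ub is the lowest weight seen so far; after completing step w,
    # the lower bound is d_lb = w + 1, and we stop when d_lb >= d_ub.
    d_ub = min(bin(row).count("1") for row in G)   # info weight 1: the rows
    w = 1
    while w + 1 < d_ub and w < k:
        w += 1
        for rows in combinations(range(k), w):
            cw = 0
            for r in rows:
                cw ^= G[r]
            d_ub = min(d_ub, bin(cw).count("1"))
    return d_ub

# [7, 4] Hamming code, rows as bit masks
print(min_distance([49, 82, 100, 120], 4))   # 3
```

The stopping rule means only codewords of information weight below the current upper bound are ever generated, rather than all 2<sup>*k*</sup> − 1 of them.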

#### *5.2.2 Brouwer's Algorithm for Linear Codes*

There is an apparent drawback to the above approach: in general, the minimum distance of a low-rate linear code is greater than its dimension. This implies that all $\sum_{w=1}^{k} \binom{k}{w}$ codewords would need to be enumerated. A more efficient algorithm is attributed to Brouwer,<sup>2</sup> and the idea behind this approach is to use a collection of generator matrices of mutually disjoint information sets [11].

**Definition 5.1** (*Information Set*) Let the set *S* = {0, 1, 2, ..., *n* − 1} be the coordinates of an [*n*, *k*, *d*] binary linear code with generator matrix *G*. A set *I* ⊆ *S* of *k* elements is an information set if the submatrix of *G* given by the coordinates in *I* has rank *k* and, hence, can be transformed into a *k* × *k* identity matrix.

In other words, in relation to a codeword, the *k*-symbol user message is contained at the coordinates specified by *I* and the redundant symbols are stored in the remaining *n* − *k* positions.

An information set corresponds to a reduced-echelon generator matrix, and it may be obtained as follows. Starting with a reduced-echelon generator matrix *G*<sup>(1)</sup><sub>sys</sub> = *G<sub>sys</sub>* = [*I<sub>k</sub>* | *P*], Gaussian elimination is applied to the submatrix *P* so that it is transformed to reduced-echelon form.

The resulting generator matrix now becomes *G*<sup>(2)</sup><sub>sys</sub> = [*A* | *I<sub>k</sub>* | *P*′], where *P*′ is a *k* × (*n* − 2*k*) matrix. Next, the submatrix *P*′ is put into reduced-echelon form, and the process continues until there exists a *k* × (*n* − *lk*) submatrix of rank less than *k*, for some integer *l*. Note that column permutations may be necessary during the transformation to maximise the number of disjoint information sets.

Let *G* be a collection of *m* reduced-echelon generator matrices of disjoint information sets, *G* = {*G*<sup>(1)</sup><sub>sys</sub>, *G*<sup>(2)</sup><sub>sys</sub>, ..., *G*<sup>(m)</sup><sub>sys</sub>}.

Using these *m* matrices means that after $\sum_{w'=1}^{w} \binom{k}{w'}$ enumerations the lower bound becomes *d<sub>lb</sub>* = *m*(*w* + 1). We can see that the lower bound has been increased by a factor of *m* compared to the previous approach. For *w* ≤ *k*/2, we know that $\binom{k}{w} \ge \binom{k}{w-1}$, and this lower bound increment reduces the bulk of the computations significantly.

If *d* is the minimum distance of the code, the total number of enumerations required is given by

<sup>2</sup>Zimmermann attributed this algorithm to Brouwer in [25].

$$\sum\_{w=1}^{\min\{\lceil d/m \rceil - 1,\, k\}} m \binom{k}{w}. \tag{5.3}$$

*Example 5.1* (*Disjoint Information Sets*) Consider the [55, 15, 20] optimal binary linear code, a shortened code of the Goppa code discovered by [15]. Three reduced-echelon generator matrices *G*<sup>(1)</sup><sub>sys</sub>, *G*<sup>(2)</sup><sub>sys</sub> and *G*<sup>(3)</sup><sub>sys</sub> of mutually disjoint information sets may be obtained for this code.

Brouwer's algorithm requires 9948 codewords to be evaluated to prove that the minimum distance of this code is 20. In contrast, for the same proof, 32767 codewords would need to be evaluated if only one generator matrix is employed.
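The extraction of mutually disjoint information sets can be sketched as follows (a minimal bit-mask implementation of our own; the [8, 4] extended Hamming code, which yields two disjoint information sets, stands in for the larger code of the example):

```python
def echelonise(G, cols, k):
    # Gauss-reduce G (rows as bit masks) so that pivots fall, where
    # possible, on the given candidate columns; returns (G, pivot columns).
    G, pivots, r = list(G), [], 0
    for c in cols:
        for i in range(r, k):
            if (G[i] >> c) & 1:
                G[r], G[i] = G[i], G[r]
                for j in range(k):
                    if j != r and (G[j] >> c) & 1:
                        G[j] ^= G[r]
                pivots.append(c)
                r += 1
                break
    return G, pivots

def disjoint_information_sets(G, n, k):
    # Brouwer: repeatedly re-echelonise on the still-unused columns,
    # collecting reduced-echelon generator matrices whose information
    # sets are mutually disjoint, until no further full-rank set exists.
    mats, used = [], set()
    while True:
        G, piv = echelonise(G, [c for c in range(n) if c not in used], k)
        if len(piv) < k:
            return mats
        mats.append(G)
        used |= set(piv)

# [8, 4] extended Hamming code: two disjoint information sets exist
G8 = [0b11100001, 0b11010010, 0b10110100, 0b01111000]
print(len(disjoint_information_sets(G8, 8, 4)))   # 2
```

The column-permutation caveat of the text applies here too: this greedy sketch does not reorder columns, so it may find fewer disjoint sets than the maximum possible.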

# *5.2.3 Zimmermann's Algorithm for Linear Codes and Some Improvements*

A further refinement of the minimum distance algorithm is due to Zimmermann [25]. As in Brouwer's approach, a set of reduced-echelon generator matrices is required. While in Brouwer's approach the procedure stops once a non-full-rank submatrix is reached, Zimmermann's approach proceeds further to obtain submatrices with overlapping information sets. Let *G*<sup>(m)</sup><sub>sys</sub> = [*A<sub>m</sub>* | *I<sub>k</sub>* | *B<sub>m+1</sub>*] be the last generator matrix which contains a disjoint information set. To obtain matrices with overlapping information sets, Gaussian elimination is performed on the submatrix *B<sub>m+1</sub>*, and this yields

$$G\_{sys}^{(m+1)} = \left[\, \hat{A}\_m \;\middle|\; \begin{matrix} \mathbf{0} \\ I\_{k-r\_{m+1}} \end{matrix} \;\middle|\; \begin{matrix} I\_{r\_{m+1}} \\ \mathbf{0} \end{matrix} \;\middle|\; B\_{m+2} \,\right],$$

where *r<sub>m+1</sub>* = Rank(*B<sub>m+1</sub>*). Next, *G*<sup>(m+2)</sup><sub>sys</sub> is produced by carrying out Gaussian elimination on the submatrix *B<sub>m+2</sub>*, and so on.

From *G*<sup>(3)</sup><sub>sys</sub> of Example 5.1, we can see that the last 10 coordinates do not form an information set, since the rank of this submatrix is clearly less than *k*. Nonetheless, a "partial" reduced-echelon generator matrix *G*<sup>(4)</sup><sub>sys</sub> can be obtained from *G*<sup>(3)</sup><sub>sys</sub>.

From *G*<sup>(4)</sup><sub>sys</sub>, we can see that the last *k* columns also form an information set, but *k* − Rank(*B*<sub>4</sub>) of its coordinates overlap with those in *G*<sup>(3)</sup><sub>sys</sub>. The generator matrix *G*<sup>(4)</sup><sub>sys</sub> may then be used to enumerate codewords, with the condition that the effect of the overlapping information set has to be taken into account.

Assuming that all codewords with information weight ≤ *w* have been enumerated, we know that

• for all *G*<sup>(i)</sup><sub>sys</sub> of full rank, say there are *m* of these matrices, all cases of *d* ≤ *mw* have been considered, and each contributes to the lower bound.

As a result, the lower bound becomes *dlb* = *m*(*w* + 1).

• for each *G*<sup>(i)</sup><sub>sys</sub> that does not have full rank, we can join *G*<sup>(i)</sup><sub>sys</sub> with column subsets of *G*<sup>(j)</sup><sub>sys</sub>, for *j* < *i*, so that we have an information set *I<sub>i</sub>*, which of course overlaps with information set *I<sub>j</sub>*. Therefore, for all of these matrices, say *M* of them, all cases of *d* ≤ *Mw* have been considered, but some of these are attributed to other information sets, and counting them again would result in double counting. According to Zimmermann [25], a matrix *G*<sup>(m+j)</sup><sub>sys</sub> with an overlapping information set makes no contribution to the lower bound unless *w* ≥ *k* − Rank(*B<sub>m+j</sub>*), in which case the lower bound becomes *d<sub>lb</sub>* = *d<sub>lb</sub>* + (*w* − (*k* − Rank(*B<sub>m+j</sub>*)) + 1).

Let the collection of full-rank reduced-echelon matrices be denoted, as before, by *G* = {*G*<sup>(1)</sup><sub>sys</sub>, *G*<sup>(2)</sup><sub>sys</sub>, ..., *G*<sup>(m)</sup><sub>sys</sub>}, and let *G*′ denote the collection of *M* matrices with overlapping information sets, *G*′ = {*G*<sup>(m+1)</sup><sub>sys</sub>, *G*<sup>(m+2)</sup><sub>sys</sub>, ..., *G*<sup>(m+M)</sup><sub>sys</sub>}. All *m* + *M* generator matrices are needed by Zimmermann's algorithm [25]. Clearly, if the condition *w* ≥ *k* − Rank(*B<sub>m+j</sub>*) is never satisfied throughout the enumeration, the corresponding generator matrix contributes nothing to the lower bound and, hence, can be excluded [11]. In order to accommodate this improvement, we need to know *w<sub>max</sub>*, the maximum information weight that would need to be enumerated before the minimum distance is found. This can be accomplished as follows. Say at information weight *w* a lower weight codeword is found, i.e. a new *d<sub>ub</sub>*. Starting from *w*′ = *w*, we let *X* = *G*′, set *d<sub>lb</sub>* = *m*(*w*′ + 1), and then increment it by (*w*′ − (*k* − Rank(*B<sub>m+j</sub>*)) + 1) for each matrix in *G*′ that satisfies *w*′ ≥ *k* − Rank(*B<sub>m+j</sub>*). Each matrix that satisfies this condition is also excluded from *X*. The weight *w*′ is incremented, *d<sub>lb</sub>* is recomputed, and at the point when *d<sub>lb</sub>* ≥ *d<sub>ub</sub>*, we have *w<sub>max</sub>*, and all matrices in *X* are those to be excluded from codeword enumeration.

In some cases, it has been observed that while enumerating codewords of information weight *w*, a codeword, whose weight coincides with the lower bound obtained at enumeration step *w*−1, appears. Clearly, this implies that the newly found codeword is indeed a minimum weight codeword; any other codeword of lower weight, if they exist, would have been found in the earlier enumeration steps. This suggests that the enumeration at step *w* may be terminated immediately. Since the bulk of computation time increases exponentially as the information weight is increased, this termination may result in a considerable saving of time.

Without loss of generality, we can assume that Rank(*B<sub>m+j</sub>*) > Rank(*B<sub>m+j+1</sub>*). With this consideration, we can implement the Zimmermann approach to minimum distance evaluation of a linear code over F<sub>2</sub>, with the improvements, in Algorithm 5.1. The procedure to update *w<sub>max</sub>* and *X* is given in Algorithm 5.2.

If there is additional code structure, the computation time required by Algorithm 5.1 can be reduced. For example, in some cases it is known that the binary code considered has even weight codewords only, then at the end of codeword enumeration at each step, the lower bound *dlb* that we obtained may be rounded down to the next multiple of 2. Similarly, for codes where the weight of every codeword is divisible by 4, the lower bound may be rounded down to the next multiple of 4.

#### *5.2.4 Chen's Algorithm for Cyclic Codes*

Binary cyclic codes, which were introduced by Prange [19], form an important class of block codes over F2. Cyclic codes constitute many well-known error-

**Algorithm 5.1** Minimum distance algorithm: improved Zimmermann's approach

```
Input:  G  = {G_sys^(1), G_sys^(2), ..., G_sys^(m)},   where |G| = m
Input:  G' = {G_sys^(m+1), G_sys^(m+2), ..., G_sys^(m+M)},   where |G'| = M
Output: d (minimum distance)
 1: d' ← dub ← wmax ← k
 2: dlb ← w ← 1
 3: X ← ∅
 4: repeat
 5:   M ← M − |X|
 6:   for all i ∈ F_2^k where wtH(i) = w do
 7:     for 1 ≤ j ≤ m do
 8:       d' ← wtH(i · G_sys^(j))
 9:       if d' < dub then
10:         dub ← d'
11:         if dub ≤ dlb then
12:           Goto Step 36
13:         end if
14:         wmax, X ← Update wmax and X (dub, k, m, G')
15:       end if
16:     end for
17:     for 1 ≤ j ≤ M do
18:       d' ← wtH(i · G_sys^(m+j))
19:       if d' < dub then
20:         dub ← d'
21:         if dub ≤ dlb then
22:           Goto Step 36
23:         end if
24:         wmax, X ← Update wmax and X (dub, k, m, G')
25:       end if
26:     end for
27:   end for
28:   dlb ← m(w + 1)
29:   for 1 ≤ j ≤ M do
30:     if w ≥ k − Rank(B_(m+j)) then
31:       dlb ← dlb + (w − (k − Rank(B_(m+j))) + 1)
32:     end if
33:   end for
34:   w ← w + 1
35: until dlb ≥ dub OR w > k
36: d ⇐ dub
```
correcting codes, such as the quadratic residue codes and the Bose–Chaudhuri–Hocquenghem (BCH) and Reed–Solomon (RS) codes that are commonly used in practice. A binary cyclic code of length *n*, where *n* is necessarily odd, has the property that if $c(x) = \sum_{i=0}^{n-1} c_i x^i$, where *c<sub>i</sub>* ∈ F<sub>2</sub>, is a codeword of the cyclic code, then *x<sup>j</sup>c*(*x*) (mod *x<sup>n</sup>* − 1), for any integer *j*, is also a codeword of that cyclic code. That is to say, the automorphism group of a cyclic code contains the coordinate permutation *i* → *i* + 1 (mod *n*).

**Algorithm 5.2** *w<sub>max</sub>*, *X* = Update *w<sub>max</sub>* and *X* (*d<sub>ub</sub>*, *k*, *m*, *G*′)

```
Input:  dub, k, m
Input:  G' = {G_sys^(m+1), G_sys^(m+2), ..., G_sys^(m+M)}
Output: wmax and X
 1: X ← G'
 2: wmax ← 1
 3: repeat
 4:   dlb ← m(wmax + 1)
 5:   for 1 ≤ j ≤ |G'| do
 6:     if wmax ≥ k − Rank(B_(m+j)) then
 7:       Remove G_sys^(m+j) from X if G_sys^(m+j) ∈ X
 8:       dlb ← dlb + (wmax − (k − Rank(B_(m+j))) + 1)
 9:     end if
10:   end for
11:   wmax ← wmax + 1
12: until dlb ≥ dub OR wmax > k
13: return wmax and X
```
An [*n*, *k*, *d*] binary cyclic code is defined by a generator polynomial *g*(*x*) of degree *n* − *k* and a parity-check polynomial *h*(*x*) of degree *k*, such that *g*(*x*)*h*(*x*) = 0 (mod *x<sup>n</sup>* − 1). Any codeword of this cyclic code is a multiple of *g*(*x*), that is, *c*(*x*) = *u*(*x*)*g*(*x*), where *u*(*x*) is any polynomial of degree less than *k*. The generator matrix *G* can be formed simply from the cyclic shifts of *g*(*x*), i.e.

$$G = \begin{bmatrix} g(x) \pmod{x^n - 1} \\ x\,g(x) \pmod{x^n - 1} \\ \vdots \\ x^{k-1} g(x) \pmod{x^n - 1} \end{bmatrix}. \tag{5.4}$$

Since for some integer *i*, *x<sup>i</sup>* = *q<sub>i</sub>*(*x*)*g*(*x*) + *r<sub>i</sub>*(*x*), where *r<sub>i</sub>*(*x*) = *x<sup>i</sup>* (mod *g*(*x*)), we can write

$$x^{n-k+i} - r\_{n-k+i}(x) = q\_{n-k+i}(x)\,g(x),$$

and based on this, a reduced-echelon generator matrix *G<sub>sys</sub>* of a cyclic code is obtained:

$$G\_{sys} = \left[\, I\_k \;\middle|\; \begin{matrix} -x^{n-k} \pmod{g(x)} \\ -x^{n-k+1} \pmod{g(x)} \\ -x^{n-k+2} \pmod{g(x)} \\ \vdots \\ -x^{n-1} \pmod{g(x)} \end{matrix} \,\right]. \tag{5.5}$$

The matrix *G<sub>sys</sub>* in (5.5) may contain several mutually disjoint information sets. But because each codeword is invariant under a cyclic shift, a codeword generated by information set *I<sub>i</sub>* can be obtained from information set *I<sub>j</sub>* by means of a simple cyclic shift. For an [*n*, *k*, *d*] cyclic code, there always exist ⌊*n*/*k*⌋ mutually disjoint information sets. As a consequence of this, using a single information set is sufficient to improve the lower bound to ⌊*n*/*k*⌋(*w* + 1) at the end of enumeration step *w*. However, Chen [7] showed that this lower bound can be further improved by noting that the average number of non-zeros of a weight *w*<sub>0</sub> codeword in an information set is *w*<sub>0</sub>*k*/*n*. After enumerating $\sum_{i=1}^{w} \binom{k}{i}$ codewords, we know that the weight of a codeword restricted to the coordinates specified by an information set is at least *w* + 1. Relating this to the average weight of a codeword in an information set, we have an improved lower bound of *d<sub>lb</sub>* = ⌈(*w* + 1)*n*/*k*⌉. Algorithm 5.3 summarises Chen's [7] approach to minimum distance evaluation of a binary cyclic code. Note that Algorithm 5.3 takes into account the early termination condition suggested in Sect. 5.2.3.

**Algorithm 5.3** Minimum distance algorithm for cyclic codes: Chen's approach

```
Input:  Gsys = [I_k | P]   {see (5.5)}
Output: d (minimum distance)
 1: dub ← k
 2: dlb ← 1
 3: w ← 1
 4: repeat
 5:   d' ← k
 6:   for all i ∈ F_2^k where wtH(i) = w do
 7:     d' ← wtH(i · Gsys)
 8:     if d' < dub then
 9:       dub ← d'
10:       if dub ≤ dlb then
11:         Goto Step 18
12:       end if
13:     end if
14:   end for
15:   dlb ← ⌈(n/k)(w + 1)⌉
16:   w ← w + 1
17: until dlb ≥ dub OR w > k
18: d ⇐ dub
```
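The construction of the systematic generator matrix (5.5) can be sketched as follows (a minimal bit-mask illustration of our own, using the [7, 4] cyclic Hamming code with *g*(*x*) = 1 + *x* + *x*<sup>3</sup> as a small test case):

```python
def gf2_mod(a, b):
    # Polynomial remainder over GF(2); polynomials as integer bit masks.
    while a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def cyclic_gsys(g, n):
    # Systematic generator matrix in the spirit of (5.5): row i is the
    # codeword x^(n-k+i) + (x^(n-k+i) mod g(x)), for i = 0, ..., k-1
    # (over GF(2) the minus signs of (5.5) are additions).
    k = n - (g.bit_length() - 1)
    return [(1 << (n - k + i)) | gf2_mod(1 << (n - k + i), g)
            for i in range(k)]

# [7, 4] cyclic Hamming code, g(x) = 1 + x + x^3
rows = cyclic_gsys(0b1011, 7)
print(rows)   # [11, 22, 39, 69]
```

Every row is a multiple of *g*(*x*) by construction, so each is a codeword; the monomial parts form the identity block of the information set.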
It is worth noting that both the minimum distance evaluation algorithm of Zimmermann [25] for linear codes and that of Chen [7] for cyclic codes may be used to compute the number of codewords of a given weight. In evaluating the minimum distance *d*, we stop the algorithm after enumerating all codewords having information weight 1 to *w*, where *w* is the smallest integer at which the condition *d<sub>lb</sub>* ≥ *d* is reached. To compute the number of codewords of weight *d*, in addition to enumerating all codewords of weight 1 to *w* in their information set, all codewords having weight *w* + 1 in their information set also need to be enumerated. For Zimmermann's method, we use all of the available information sets, including those that overlap, and store all codewords whose weight matches *d*. For Chen's algorithm, in contrast, we use only a single information set, but for each codeword of weight *d* found, we accumulate this codeword and all of its *n* − 1 cyclic shifts. In both approaches, it is necessary to remove the doubly counted codewords at the end of the enumeration stage.
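Chen-style counting with cyclic-shift closure can be sketched as follows (a minimal illustration of our own on the [7, 4] cyclic Hamming code, which has exactly seven codewords of weight 3; `w_stop` denotes the stopping weight *w* of the distance evaluation):

```python
from itertools import combinations

def count_weight_d_cyclic(G, n, k, d, w_stop):
    # Enumerate information patterns of weight 1 to w_stop + 1, keep the
    # codewords of weight d, close the collection under all n cyclic
    # shifts, and deduplicate with a set (the double-count removal).
    found = set()
    mask = (1 << n) - 1
    for w in range(1, w_stop + 2):
        for rows in combinations(range(k), w):
            cw = 0
            for r in rows:
                cw ^= G[r]
            if bin(cw).count("1") == d:
                for _ in range(n):
                    found.add(cw)
                    cw = ((cw << 1) | (cw >> (n - 1))) & mask
    return len(found)

# [7, 4] cyclic Hamming code: rows g(x), x g(x), x^2 g(x), x^3 g(x)
print(count_weight_d_cyclic([11, 22, 44, 88], 7, 4, 3, 1))   # 7
```

Here the set does the deduplication implicitly; for long codes the accumulated codewords are instead sorted and scanned for repeats.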

## *5.2.5 Codeword Enumeration Algorithm*

The core of all minimum distance evaluation and codeword counting algorithms lies in the codeword enumeration. Given a reduced-echelon generator matrix, codewords can be enumerated by taking linear combinations of the rows in the generator matrix. This suggests the need for an efficient algorithm to generate combinations.

One of the most efficient algorithms for this purpose is the revolving-door algorithm; see [4, 13, 17]. The efficiency of the revolving-door algorithm arises from the property that, in going from one combination pattern to the next, only one element is exchanged. An efficient implementation of the revolving-door algorithm, called *Algorithm R*, is given in [13] and is attributed to [18].<sup>3</sup>

In many cases, using a single-threaded program to either compute the minimum distance, or count the number of codewords of a given weight, of a linear code may take a considerable amount of computer time, possibly several weeks.

For these long codes, we may resort to a multi-threaded approach by splitting the codeword enumeration task between multiple computers. The revolving-door algorithm has a nice property that allows such splitting to be neatly realised. Let *a<sub>t</sub>a<sub>t−1</sub>* ... *a*<sub>2</sub>*a*<sub>1</sub>, where *a<sub>t</sub>* > *a<sub>t−1</sub>* > ... > *a*<sub>2</sub> > *a*<sub>1</sub>, be a pattern of the *t* out of *s* combinations *C<sup>s</sup><sub>t</sub>*. A pattern is said to have rank *i* if this pattern appears as the (*i* + 1)th element in the list of all *C<sup>s</sup><sub>t</sub>* combinations.<sup>4</sup> Let Rank(*a<sub>t</sub>a<sub>t−1</sub>* ... *a*<sub>2</sub>*a*<sub>1</sub>) be the rank of pattern *a<sub>t</sub>a<sub>t−1</sub>* ... *a*<sub>2</sub>*a*<sub>1</sub>; the revolving-door algorithm has the property that [13]

$$\text{Rank}(a\_t a\_{t-1} \dots a\_2 a\_1) = \left[ \binom{a\_t + 1}{t} - 1 \right] - \text{Rank}(a\_{t-1} \dots a\_2 a\_1) \tag{5.6}$$

and, for each integer *N*, where 0 ≤ *N* ≤ $\binom{s}{t} - 1$, we can represent it uniquely with an ordered pattern *a<sub>t</sub>a<sub>t−1</sub>* ... *a*<sub>2</sub>*a*<sub>1</sub>. As an implication of this and (5.6), if all $\binom{k}{t}$ codewords need to be enumerated, we can split the enumeration into $\lceil \binom{k}{t}/M \rceil$ blocks, where in each block only at most *M* codewords need to be generated. In

<sup>3</sup>This is the version that the authors implemented to compute the minimum distance and to count the number of codewords of a given weight of a binary linear code.

<sup>4</sup>Here it is assumed that the first element in the list of all $C^s\_t$ combinations has rank 0.


**Fig. 5.1** $C^6\_4$ and $C^7\_5$ revolving-door combination patterns

this way, the enumeration of each block can be done on a separate computer, allowing parallelisation of the minimum distance evaluation, as well as of the counting of the number of codewords of a given weight. We know that the $i$th block's enumeration starts from rank $(i-1)M$, and the corresponding pattern can easily be obtained following (5.6) and Lemma 5.1 below.
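As an illustrative sketch (the helper name `block_plan` is ours, not from the book), the block boundaries for such a parallel run can be computed directly:

```python
from math import comb

def block_plan(k, t, block_size):
    """Split the enumeration of all comb(k, t) combination patterns into
    blocks of at most `block_size` patterns; block i (0-based) starts at
    rank i * block_size, as described in the text."""
    total = comb(k, t)
    n_blocks = -(-total // block_size)  # ceiling division
    return [(i * block_size, min((i + 1) * block_size, total))
            for i in range(n_blocks)]
```

For example, `block_plan(6, 4, 4)` splits the $\binom{6}{4} = 15$ patterns of Fig. 5.1 into four blocks: `[(0, 4), (4, 8), (8, 12), (12, 15)]`, each of which can be assigned to a separate computer.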

All $a\_t a\_{t-1} \ldots a\_2 a\_1$ revolving-door patterns of $C^s\_t$ satisfy the property that if the values in position $a\_t$ grow in increasing order, then, for fixed $a\_t$, the values in position $a\_{t-1}$ grow in decreasing order; moreover, for fixed $a\_t a\_{t-1}$, the values in position $a\_{t-2}$ grow in increasing order, and so on in alternating order. This behaviour is evident by observing all revolving-door patterns of $C^6\_4$ (left) and $C^7\_5$ (right) shown in Fig. 5.1.

From this figure, we can also observe that

$$C\_t^s \supset C\_t^{s-1} \supset \dots \supset C\_t^{t+1} \supset C\_t^t,\tag{5.7}$$

and this suggests the following lemma.

**Lemma 5.1** (Maximum and Minimum Ranks) *Consider the $a\_t a\_{t-1} \ldots a\_2 a\_1$ revolving-door combination patterns. Among patterns with fixed $a\_t$, the maximum and minimum ranks are respectively given by*

$$\binom{a\_t+1}{t} - 1 \qquad \text{and} \qquad \binom{a\_t}{t}.$$

*Example 5.2* (*Maximum and Minimum Ranks*) Consider all $C^6\_4$ revolving-door combination patterns (left portion of Fig. 5.1) where $a\_4 = 4$. From Lemma 5.1, we have a maximum rank of $\binom{5}{4} - 1 = 4$ and a minimum rank of $\binom{4}{4} = 1$. We can see from Fig. 5.1 that these rank values are correct.

*Example 5.3* (*The Revolving-Door Algorithm*) Consider the combinations $C^7\_5$ generated by the revolving-door algorithm; we would like to determine the combination pattern of rank 17. We know that the combination pattern takes the ordered form $a\_5 a\_4 a\_3 a\_2 a\_1$, where $a\_i > a\_{i-1}$. Starting from $a\_5$, which can take values from 0 to 6, we need to find $a\_5$ such that the inequality $\binom{a\_5}{5} \le 17 \le \binom{a\_5+1}{5} - 1$ is satisfied (Lemma 5.1). It follows that $a\_5 = 6$ and, using (5.6), we have

$$\begin{aligned} 17 &= \text{Rank}(6a\_4a\_3a\_2a\_1) \\ &= \left[ \binom{6+1}{5} - 1 \right] - \text{Rank}(a\_4a\_3a\_2a\_1) \\ \text{Rank}(a\_4a\_3a\_2a\_1) &= 20 - 17 = 3 \end{aligned}$$

Next, we consider $a\_4$ and, as before, we need to find $a\_4 \in \{5, 4, 3, 2, 1, 0\}$ such that the inequality $\binom{a\_4}{4} \le \text{Rank}(a\_4 a\_3 a\_2 a\_1) \le \binom{a\_4+1}{4} - 1$ is satisfied. It follows that $a\_4 = 4$ and, from (5.6), we have

$$\begin{aligned} 3 &= \text{Rank}\left(4a\_3a\_2a\_1\right) \\ &= \left[\binom{4+1}{4} - 1\right] - \text{Rank}\left(a\_3a\_2a\_1\right) \\ \text{Rank}\left(a\_3a\_2a\_1\right) &= 4 - 3 = 1 \end{aligned}$$

Next, we need to find $a\_3$, which can only take a value less than 4, such that the inequality $\binom{a\_3}{3} \le \text{Rank}(a\_3 a\_2 a\_1) \le \binom{a\_3+1}{3} - 1$ is satisfied. It follows that $a\_3 = 3$ and, from (5.6), $\text{Rank}(a\_2 a\_1) = \binom{3+1}{3} - 1 - 1 = 2$.

So far we have $643a\_2a\_1$; only $a\_2$ and $a\_1$ are unknown. Since $a\_3 = 3$, $a\_2$ can only take a value less than 3. The inequality $\binom{a\_2}{2} \le \text{Rank}(a\_2 a\_1) \le \binom{a\_2+1}{2} - 1$ is satisfied if $a\_2 = 2$ and, correspondingly, $\text{Rank}(a\_1) = \binom{2+1}{2} - 1 - 2 = 0$.

For the last case, the inequality $\binom{a\_1}{1} \le \text{Rank}(a\_1) \le \binom{a\_1+1}{1} - 1$ is true if and only if $a\_1 = 0$. Thus, we have 64320 as the rank-17 $C^7\_5$ revolving-door pattern. Cross-checking with Fig. 5.1, we can see that 64320 is indeed of rank 17.

From (5.6) and Example 5.3, we can see that, given a rank $N$, where $0 \le N \le \binom{s}{t} - 1$, we can recursively construct the corresponding ordered pattern $a\_t a\_{t-1} \ldots a\_2 a\_1$ of the $C^s\_t$ revolving-door combinations. A software realisation of this recursive approach is given in Algorithm 5.4.

## **Algorithm 5.4** RecursiveComputeAi$(\text{Rank}(a\_i a\_{i-1} \ldots a\_2 a\_1), i)$

**Input:** $i$ and $\text{Rank}(a\_i a\_{i-1} \ldots a\_2 a\_1)$
**Output:** $a\_i$
1: Find $a\_i$, where $0 \le a\_i < a\_{i+1}$, such that $\binom{a\_i}{i} \le \text{Rank}(a\_i a\_{i-1} \ldots a\_2 a\_1) \le \binom{a\_i+1}{i} - 1$
2: **if** $i > 1$ **then**
3: Compute $\text{Rank}(a\_{i-1} \ldots a\_2 a\_1) = \binom{a\_i+1}{i} - 1 - \text{Rank}(a\_i a\_{i-1} \ldots a\_2 a\_1)$
4: RecursiveComputeAi$(\text{Rank}(a\_{i-1} \ldots a\_2 a\_1), i - 1)$
5: **end if**
6: **return** $a\_i$
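The recursion above may be sketched in Python as follows (our own illustrative implementation, not the authors' code); it recovers the full pattern for a given rank by repeatedly applying Lemma 5.1 and (5.6):

```python
from math import comb

def unrank_revolving_door(rank, t, s):
    """Return the pattern [a_t, ..., a_1] (with a_t > ... > a_1) appearing
    at position `rank` (counting from 0) in the revolving-door listing of
    all comb(s, t) combinations, via Lemma 5.1 and Eq. (5.6)."""
    if not 0 <= rank < comb(s, t):
        raise ValueError("rank out of range")
    pattern = []
    for i in range(t, 0, -1):
        # Lemma 5.1: a_i is the unique value with
        # comb(a_i, i) <= rank <= comb(a_i + 1, i) - 1.
        a = i - 1
        while comb(a + 1, i) <= rank:
            a += 1
        pattern.append(a)
        # Eq. (5.6): the remaining sub-pattern has the complementary rank.
        rank = comb(a + 1, i) - 1 - rank
    return pattern
```

For instance, `unrank_revolving_door(17, 5, 7)` returns `[6, 4, 3, 2, 0]`, matching the pattern 64320 of Example 5.3.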

# **5.3 Binary Cyclic Codes of Lengths 129 ≤ *n* ≤ 189**

The minimum distance of all binary cyclic codes of lengths less than or equal to 99 has been determined by Chen [7, 8] and Promhouse et al. [21].

This was later extended to longer codes with the evaluation of the minimum distance of binary cyclic codes of lengths from 101 to 127 by Schomaker et al. [22]. In this book, we extend this work to include all cyclic codes of odd lengths from 129 to 189. The aim was to produce a table of codes as a reference source for the highest minimum distance, with the corresponding roots of the generator polynomial, attainable by all cyclic codes over $\mathbb{F}\_2$ of odd lengths from 129 to 189. It is well known that the coordinate permutation $\sigma : i \to \mu i$, where $\mu$ is an integer relatively prime to $n$, produces equivalent cyclic codes [3, p. 141f]. With respect to this property, we construct a list of generator polynomials $g(x)$ of all inequivalent and non-degenerate [16, p. 223f] cyclic codes of $129 \le n \le 189$ by taking products of the irreducible factors of $x^n - 1$. Two trivial cases are excluded, namely $g(x) = x + 1$ and $g(x) = (x^n - 1)/(x + 1)$, since these codes have trivial minimum distance and exist for any odd integer $n$. For each $g(x)$ of a cyclic code of odd length $n$, the systematic generator matrix is formed and the minimum distance of the code is determined using Chen's algorithm (Algorithm 5.3). However, due to the large number of cyclic codes and the fact that we are only interested in those of largest minimum distance for given $n$ and $k$, we include a threshold distance $d\_{th}$ in Algorithm 5.3. Say, for given $n$ and $k$, we have a list of generator polynomials $g(x)$ of all inequivalent cyclic codes. Starting from the top of the list, the minimum distance of the corresponding cyclic code is evaluated. If a codeword of weight less than or equal to $d\_{th}$ is found during the enumeration, the computation is terminated immediately and the next $g(x)$ is processed. The threshold $d\_{th}$, which is initialised to 0, is updated with the largest minimum distance found so far for the given $n$ and $k$.
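The threshold logic can be sketched as follows, with each candidate code represented abstractly by a stream of its nonzero-codeword weights (a simplification of the enumeration performed by Algorithm 5.3; the function name and representation are ours):

```python
def thresholded_search(candidates):
    """candidates: iterable of (name, weight_iterable) pairs.
    Scan each candidate's nonzero-codeword weights, aborting as soon as
    a weight <= d_th is seen; d_th tracks the largest minimum distance
    found so far, as in the search described in the text."""
    d_th, best = 0, None
    for name, weights in candidates:
        d = None
        for w in weights:
            if w <= d_th:        # cannot beat the threshold: abort early
                d = None
                break
            d = w if d is None else min(d, w)
        if d is not None and d > d_th:
            d_th, best = d, name
    return d_th, best
```

With three toy candidates whose weight streams are `[8, 6, 10]`, `[7, 9]` and `[12, 5]`, the third is abandoned as soon as the weight 5 (below the running threshold 7) is seen.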

Table 4.3 in Sect. 4.5 shows the highest attainable minimum distance of all binary cyclic codes of odd lengths from 129 to 189. The number of inequivalent and non-degenerate cyclic codes for a given odd integer $n$, excluding the two trivial cases mentioned above, is denoted by $N\_C$.

Note that Table 4.3 does not contain entries for primes $n = 8m \pm 3$. This is because, for these primes, 2 is not a quadratic residue modulo $n$ and hence $\text{ord}\_2(n) = n - 1$. As a consequence, $x^n - 1$ factors into two irreducible polynomials only, namely $x + 1$ and $(x^n - 1)/(x + 1)$, which generate trivial codes. Let $\beta$ be a primitive $n$th root of unity; the roots of $g(x)$ of a cyclic code (excluding the conjugate roots) are given in terms of the exponents of $\beta$. The polynomial $m(x)$ is the minimal polynomial of $\beta$ and it is represented in octal format with the most significant bit on the left. That is, $m(x) = 166761$, as in the case for $n = 151$, represents $x^{15} + x^{14} + x^{13} + x^{11} + x^{10} + x^8 + x^7 + x^6 + x^5 + x^4 + 1$.
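The octal convention can be decoded mechanically; the following short helper (our own, not from the book) recovers the exponents of $m(x)$ from its octal representation:

```python
def octal_to_exponents(octal_str):
    """Interpret an octal string (most significant bit on the left) as a
    binary polynomial and return the exponents with nonzero coefficients,
    highest first."""
    bits = bin(int(octal_str, 8))[2:]
    deg = len(bits) - 1
    return [deg - i for i, b in enumerate(bits) if b == "1"]
```

For example, `octal_to_exponents("166761")` gives `[15, 14, 13, 11, 10, 8, 7, 6, 5, 4, 0]`, i.e. exactly the polynomial quoted above for $n = 151$.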

# **5.4 Some New Binary Cyclic Codes Having Large Minimum Distance**

Constructing an $[n, k]$ linear code possessing the largest minimum distance is one of the main problems in coding theory. There exists a database containing the lower and upper bounds on the minimum distance of binary linear codes of lengths $1 \le n \le 256$. This database appears in [6] and the updated version is accessible online.<sup>5</sup>

The lower bound corresponds to the largest minimum distance of an $[n, k]\_q$ code that has been found to date. Constructing codes which improve Brouwer's lower bounds is an on-going research activity in coding theory. Recently, tables of lower and upper bounds, not only for codes over finite fields but also for quantum error-correcting codes, have been published by Grassl [12]. These bounds for codes over finite fields, which are derived using MAGMA [5], appear to be more up-to-date than those of Brouwer.

We have presented in Sect. 5.3 the highest minimum distance attainable by all binary cyclic codes of odd lengths from 129 to 189 and found that none of these cyclic codes has a larger minimum distance than the corresponding Brouwer lower bound for the same $n$ and $k$. The next step is to consider longer cyclic codes, $191 \le$

<sup>5</sup>The database is available at http://www.win.tue.nl/~aeb/voorlincod.html.

Note that, since 12th March 2007, A. Brouwer has stopped maintaining his database and hence it is no longer accessible. This database is now superseded by the one maintained by Grassl [12].

$n \le 255$. For these lengths, unfortunately, we have not been able to repeat the exhaustive approach of Sect. 5.3 in a reasonable amount of time. This is due to the computation time required to determine the minimum distance of these cyclic codes and also because, for some lengths (e.g. 195 and 255), there is a tremendous number of inequivalent cyclic codes. Having said that, we can still search for improvements among lower-rate cyclic codes of these lengths, for which the minimum distance computation can be completed in a reasonable time. We have found many new cyclic codes that improve on Brouwer's lower bounds; before we present these codes, we first describe the evaluation procedure.

As before, let $\beta$ be a primitive $n$th root of unity and let $\Lambda$ be a set containing all distinct exponents of $\beta$ (excluding the conjugates). The polynomial $x^n - 1$ can be factorised into irreducible polynomials $f\_i(x)$ over $\mathbb{F}\_2$: $x^n - 1 = \prod\_{i \in \Lambda} f\_i(x)$. For notational purposes, we denote the irreducible polynomial $f\_i(x)$ as the minimal polynomial of $\beta^i$. The generator and parity-check polynomials, denoted by $g(x)$ and $h(x)$ respectively, are products of the $f\_i(x)$. Given a set $\Gamma \subseteq \Lambda$, a cyclic code $C$ which has $\beta^i$, $i \in \Gamma$, as its non-zeros can be constructed. This means that the parity-check polynomial $h(x)$ is given by

$$h(x) = \prod\_{i \in \Gamma} f\_i(x)$$

and the dimension $k$ of this cyclic code is $\sum\_{i \in \Gamma} \deg(f\_i(x))$, where $\deg(f(x))$ denotes the degree of $f(x)$. Let $\Gamma' \subseteq \Lambda \setminus \{0\}$, $h'(x) = \prod\_{i \in \Gamma'} f\_i(x)$ and $h(x) = (1 + x)h'(x)$. Given $C$ with parity-check polynomial $h(x)$, there exists an $[n, k-1, d']$ expurgated cyclic code, $C'$, which has parity-check polynomial $h'(x)$. For this cyclic code, $\text{wt}\_H(c) \equiv 0 \pmod 2$ for all $c \in C'$. For convenience, we call $C$ the augmented code of $C'$.
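The products of the $f\_i(x)$ are ordinary polynomial multiplications over $\mathbb{F}\_2$. A minimal sketch (our own helper), with polynomials stored as integer bit masks where bit $j$ holds the coefficient of $x^j$:

```python
def polymul_gf2(a, b):
    """Multiply two binary polynomials represented as integer bit masks."""
    result = 0
    while b:
        if b & 1:          # current bit of b set: add shifted copy of a
            result ^= a    # addition over GF(2) is XOR
        a <<= 1
        b >>= 1
    return result
```

For example, for $n = 7$, $x^7 - 1 = (x+1)(x^3+x+1)(x^3+x^2+1)$ over $\mathbb{F}\_2$, and multiplying the bit masks `0b11`, `0b1011` and `0b1101` indeed yields `0b10000001`, i.e. $x^7 + 1$.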

Consider an $[n, k-1, d']$ expurgated cyclic code $C'$, and let the set $\boldsymbol{\Gamma} = \{\Gamma\_1, \Gamma\_2, \ldots, \Gamma\_r\}$ where, for $1 \le j \le r$, $\Gamma\_j \subseteq \Lambda \setminus \{0\}$ and $\sum\_{i \in \Gamma\_j} \deg(f\_i(x)) = k - 1$. For each $\Gamma\_j \in \boldsymbol{\Gamma}$, we compute $h'(x)$ and construct $C'$. Having constructed the expurgated code, the augmented code can easily be obtained as shown below. Let $G$ be a generator matrix of the augmented code $C$; without loss of generality, it can be written as

$$\mathbf{G} = \begin{bmatrix} \mathbf{G}' \\ \mathbf{v} \end{bmatrix} \tag{5.8}$$

where $\mathbf{G}'$ is a generator matrix of $C'$ and the vector $\mathbf{v}$ is a representative of the coset of $C'$ in $C$. Using the arrangement in (5.8), we evaluate $d'$ by enumerating codewords $c \in C'$ from $\mathbf{G}'$. The minimum distance of $C$, denoted by $d$, is simply $\min\{d', \min\_{c \in C'} \text{wt}\_H(c + \mathbf{v})\}$ over all codewords $c$ enumerated. We follow Algorithm 5.3 to evaluate $d'$. Let $d\_{Brouwer}$

and $d'\_{Brouwer}$ denote the lower bounds of [6] for linear codes of the same length and dimension as those of $C$ and $C'$, respectively. During the enumerations, as soon as $d \le d\_{Brouwer}$ and $d' \le d'\_{Brouwer}$, the evaluation is terminated and the next $\Gamma\_j$ in $\boldsymbol{\Gamma}$ is processed. However, if $d \le d\_{Brouwer}$ and $d' > d'\_{Brouwer}$, only the evaluation for $C$ is discarded. Nothing is discarded if both $d' > d'\_{Brouwer}$ and $d > d\_{Brouwer}$. This procedure continues until an improvement is obtained, or the set $\boldsymbol{\Gamma}$ has been exhausted, which means that there do not exist $[n, k-1]$ and $[n, k]$ cyclic codes which improve on Brouwer's lower bounds. In cases where the minimum distance computation is not feasible using a single computer, we switch to a parallel version using grid computers.

Table 5.1 presents the results of the search for new binary cyclic codes of lengths $195 \le n \le 255$. The cyclic codes in this table are expressed in terms of the parity-check polynomial $h(x)$, which is given in the last column by the exponents of $\beta$ (excluding the conjugates). Note that the polynomial $m(x)$, which is given in octal with the most significant bit on the left, is the minimal polynomial of $\beta$. In many cases, the entries of $C'$ and $C$ are combined in a single row and this is indicated by "$a/b$", where the parameters $a$ and $b$ are for $C'$ and $C$, respectively. The notation "[0]" indicates that the polynomial $(1 + x)$ is to be excluded from the parity-check polynomial of $C'$.

Some of these tabulated cyclic codes have a minimum Hamming distance which coincides with the lower bounds given in [12]. These are presented in Table 5.1 with the indicative mark "†".

In the late 1970s, computing the minimum distance of extended quadratic residue (QR) codes was posed as an open research problem by MacWilliams and Sloane [16]. Since then, the minimum distance of the extended QR code for the prime 199 has remained an open question. For this code, the bounds on the minimum distance were given as 16–32 in [16] and the lower bound was improved to 24 in [9]. Since $199 \equiv -1 \pmod 8$, the extended code is a doubly even self-dual code and its automorphism group contains a projective special linear group, which is known to be doubly transitive [16]. As a result, the minimum distance of the binary [199, 100] QR code is odd, with $d \equiv 3 \pmod 4$, and hence $d = 23$, 27 or 31. Due to the cyclic property and the rate of this QR code [7], we can safely assume that a codeword of weight $d$ has a cyclic shift of information weight at most $\lfloor d/2 \rfloor$. If a weight-$d$ codeword does not satisfy this property, one of its cyclic shifts must. After enumerating all codewords up to (and including) information weight 13 using grid computers, no codeword of weight less than 31 was found, implying that $d$ of this binary [199, 100] QR code is indeed 31.

Without exploiting the property that $d \equiv 3 \pmod 4$, an additional $\binom{100}{14} + \binom{100}{15}$ codewords (88,373,885,354,647,200 codewords) would need to be enumerated in order to establish the same result, which is beyond available computer resources. Accordingly, we now know that there exist the [199, 99, 32] expurgated QR code and the [200, 100, 32] extended QR code.

It is interesting to note that many of the code improvements are contributed by low-rate cyclic codes of length 255 and there are 16 cases of this. Furthermore, it is also interesting that Table 5.1 includes a [255, 55, 70] cyclic code and a [255, 63, 65]


**Table 5.1** New binary cyclic codes

cyclic code, which are superior to the BCH codes of the same length and dimension. Both of these BCH codes have minimum distance 63 only.

## **5.5 Constructing New Codes from Existing Ones**

It is difficult to explicitly construct a new code with large minimum distance. However, the alternative approach, which starts from a known code which already has large minimum distance, seems to be more fruitful. Some of these methods are described below and in the following subsections, we present some new binary codes constructed using these methods, which improve on Brouwer's lower bound.

**Theorem 5.1** (Construction X) *Let $B\_1$ and $B\_2$ be $[n, k\_1, d\_1]$ and $[n, k\_2, d\_2]$ linear codes over $\mathbb{F}\_q$ respectively, where $B\_1 \supset B\_2$ ($B\_2$ is a subcode of $B\_1$). Let $A$ be an $[n', k\_3 = k\_1 - k\_2, d']$ auxiliary code over the same field. Then there exists an $[n + n', k\_1, \min\{d\_2, d\_1 + d'\}]$ code $C\_X$ over $\mathbb{F}\_q$.*

Construction X is due to Sloane et al. [23]; it adds a tail, which is a codeword of the auxiliary code $A$, to $B\_1$ so that the minimum distance is increased. The effect of Construction X can be visualised as follows. Let $\mathbf{G}\_C$ denote the generator matrix of a code $C$. Since $B\_1 \supset B\_2$, we may express $\mathbf{G}\_{B\_1}$ as

$$\mathbf{G}\_{B\_1} = \begin{bmatrix} \mathbf{G}\_{B\_2} \\ \mathbf{V} \end{bmatrix},$$

where $\mathbf{V}$ is a $(k\_1 - k\_2) \times n$ matrix whose rows are coset representatives of $B\_2$ in $B\_1$. The code generated by $\mathbf{G}\_{B\_2}$ has minimum distance $d\_2$, and the set of codewords $\{\mathbf{v} + c\_2\}$, for all rows $\mathbf{v}$ of $\mathbf{V}$ and all codewords $c\_2$ generated by $\mathbf{G}\_{B\_2}$, has minimum weight $d\_1$. By appending non-zero codewords of $A$ to the set $\{\mathbf{v} + c\_2\}$, and the all-zeros codeword to each codeword of $B\_2$, we obtain a lengthened code of larger minimum distance, $C\_X$, whose generator matrix is given by

$$\mathbf{G}\_{C\_X} = \begin{bmatrix} \mathbf{G}\_{B\_2} & \mathbf{0} \\ \mathbf{V} & \mathbf{G}\_{A} \end{bmatrix}. \tag{5.9}$$

We can see that, for binary cyclic linear codes of odd minimum distance, code extension by annexing an overall parity-check bit is an instance of Construction X. In this case, *B*<sup>2</sup> is the even-weight subcode of *B*<sup>1</sup> and the auxiliary code *A* is the trivial [1, 1, 1]<sup>2</sup> code.
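This parity-check-extension instance of Construction X can be checked with a short sketch (our own illustration, using the standard [7, 4, 3] Hamming generator rows below): $B\_1$ is the Hamming code, $B\_2$ its even-weight subcode, and $A$ the $[1, 1, 1]\_2$ code, giving the [8, 4, 4] extended Hamming code.

```python
def min_distance(rows):
    """Exhaustive minimum Hamming distance of the binary code spanned by rows."""
    n, k = len(rows[0]), len(rows)
    best = n
    for m in range(1, 2 ** k):          # all nonzero message patterns
        cw = [0] * n
        for i in range(k):
            if (m >> i) & 1:
                cw = [a ^ b for a, b in zip(cw, rows[i])]
        best = min(best, sum(cw))
    return best

def construction_x(g_b2, v_rows, g_a):
    """Assemble the generator matrix of Eq. (5.9): [[G_B2 | 0], [V | G_A]]."""
    n_a = len(g_a[0])
    return ([row + [0] * n_a for row in g_b2] +
            [v + a for v, a in zip(v_rows, g_a)])

def xor(u, v):
    return [a ^ b for a, b in zip(u, v)]

# B1: [7,4,3] Hamming code; r1 has odd weight, so the even-weight
# subcode B2 is spanned by r1+r2, r1+r3, r4, and V = [r1].
r1 = [1, 0, 0, 0, 0, 1, 1]
r2 = [0, 1, 0, 0, 1, 0, 1]
r3 = [0, 0, 1, 0, 1, 1, 0]
r4 = [0, 0, 0, 1, 1, 1, 1]
g_b2 = [xor(r1, r2), xor(r1, r3), r4]
g_x = construction_x(g_b2, [r1], [[1]])   # auxiliary code A = [1, 1, 1]
```

Here `min_distance(g_x)` returns 4, confirming the $[8, 4, \min\{4, 3+1\}] = [8, 4, 4]$ parameters predicted by Theorem 5.1.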

Construction X given in Theorem 5.1 considers a chain of two codes only. There also exists a variant of Construction X, called Construction XX, which makes use of Construction X twice; it was introduced by Alltop [1].

**Theorem 5.2** (Construction XX) *Consider three linear codes of the same length, $B\_1 = [n, k\_1, d\_1]$, $B\_2 = [n, k\_2, d\_2]$ and $B\_3 = [n, k\_3, d\_3]$, where $B\_2 \subset B\_1$ and $B\_3 \subset B\_1$. Let $B\_4$ be an $[n, k\_4, d\_4]$ linear code which is the intersection of $B\_2$ and $B\_3$, i.e. $B\_4 = B\_2 \cap B\_3$. Using auxiliary codes $A\_1 = [n\_1, k\_1 - k\_2, d'\_1]$ and $A\_2 = [n\_2, k\_1 - k\_3, d'\_2]$, there exists an $[n + n\_1 + n\_2, k\_1, d]$ linear code $C\_{XX}$, where $d = \min\{d\_4, d\_3 + d'\_1, d\_2 + d'\_2, d\_1 + d'\_1 + d'\_2\}$.*

The relationship among *B*1, *B*2, *B*<sup>3</sup> and *B*<sup>4</sup> can be illustrated as a lattice shown below [11].

Since *B*<sup>1</sup> ⊃ *B*2, *B*<sup>1</sup> ⊃ *B*3, *B*<sup>4</sup> ⊂ *B*<sup>2</sup> and *B*<sup>4</sup> ⊂ *B*3, the generator matrices of *B*2, *B*<sup>3</sup> and *B*<sup>1</sup> can be written as

$$\mathbf{G}\_{B\_2} = \begin{bmatrix} \mathbf{G}\_{B\_4} \\ \mathbf{V}\_2 \end{bmatrix}, \qquad \mathbf{G}\_{B\_3} = \begin{bmatrix} \mathbf{G}\_{B\_4} \\ \mathbf{V}\_3 \end{bmatrix} \qquad \text{and} \qquad \mathbf{G}\_{B\_1} = \begin{bmatrix} \mathbf{G}\_{B\_4} \\ \mathbf{V}\_2 \\ \mathbf{V}\_3 \\ \mathbf{V} \end{bmatrix}$$

respectively, where $\mathbf{V}\_i$, $i = 2, 3$, contains coset representatives of $B\_4$ in $B\_i$, and $\mathbf{V}$ contains the coset representatives of $B\_2$ and $B\_3$ in $B\_1$. Construction XX starts by applying Construction X to the pair of codes $B\_1 \supset B\_2$ using $A\_1$ as the auxiliary code. The resulting code is $C\_X = [n + n\_1, k\_1, \min\{d\_2, d\_1 + d'\_1\}]$, whose generator matrix is given by

$$\mathbf{G}\_{C\_X} = \begin{bmatrix} \mathbf{G}\_{B\_4} & \mathbf{0} \\ \mathbf{V}\_2 & \mathbf{0} \\ \mathbf{V}\_3 & \mathbf{G}\_{A\_1}^{(1)} \\ \mathbf{V} & \mathbf{G}\_{A\_1}^{(2)} \end{bmatrix}.$$

This generator matrix can be rearranged such that the codewords formed from the first $n$ coordinates are cosets of $B\_3$ in $B\_1$. This rearrangement results in the following generator matrix of $C\_X$:

$$\mathbf{G}\_{C\_X} = \begin{bmatrix} \mathbf{G}\_{B\_4} & \mathbf{0} \\ \mathbf{V}\_3 & \mathbf{G}\_{A\_1}^{(1)} \\ \mathbf{V}\_2 & \mathbf{0} \\ \mathbf{V} & \mathbf{G}\_{A\_1}^{(2)} \end{bmatrix},$$

where $\mathbf{G}\_{A\_1} = \begin{bmatrix} \mathbf{G}\_{A\_1}^{(1)} \\ \mathbf{G}\_{A\_1}^{(2)} \end{bmatrix}$. Next, using $A\_2$ as the auxiliary code and applying Construction X to the pair $B\_1 \supset B\_3$ with the rearrangement above, we obtain $C\_{XX}$, whose generator matrix is

$$\mathbf{G}\_{C\_{XX}} = \begin{bmatrix} \mathbf{G}\_{B\_4} & \mathbf{0} & \mathbf{0} \\ \mathbf{V}\_3 & \mathbf{G}\_{A\_1}^{(1)} & \mathbf{0} \\ \mathbf{V}\_2 & \mathbf{0} & \mathbf{G}\_{A\_2}^{(1)} \\ \mathbf{V} & \mathbf{G}\_{A\_1}^{(2)} & \mathbf{G}\_{A\_2}^{(2)} \end{bmatrix},$$

where $\mathbf{G}\_{A\_2}$ is partitioned into $\mathbf{G}\_{A\_2}^{(1)}$ and $\mathbf{G}\_{A\_2}^{(2)}$ analogously to $\mathbf{G}\_{A\_1}$.

While Constructions X and XX result in a code of increased length, there also exists a technique to obtain a shorter code, with a known lower bound on its minimum distance, from a longer code whose minimum distance, and that of its dual code, are known explicitly. This technique is due to Sloane et al. [23] and is called Construction Y1.

**Theorem 5.3** (Construction Y1) *Given an $[n, k, d]$ linear code $C$, which has an $[n, n-k, d^\perp]$ code $C^\perp$ as its dual, an $[n - d^\perp, k - d^\perp + 1, \ge d]$ code $C'$ can be constructed.*

Given an $[n, k, d]$ code, standard code shortening yields an $[n-i, k-i, \ge d]$ code, where $i$ is the number of coordinates shortened. With Construction Y1, however, we gain an additional dimension in the resulting shortened code. This can be explained as follows. Without loss of generality, we can assume that the parity-check matrix $\mathbf{H}$ of $C$, which is also the generator matrix of $C^\perp$, contains a codeword $c^\perp$ of weight $d^\perp$ as a row. If we delete the coordinates which form the support of $c^\perp$ from $\mathbf{H}$, then $\mathbf{H}$ becomes an $(n-k) \times (n - d^\perp)$ matrix with one all-zeros row among its $n-k$ rows. Removing this all-zeros row, we have an $(n-k-1) \times (n - d^\perp)$ matrix $\mathbf{H}'$, which is the parity-check matrix of an $[n - d^\perp, n - d^\perp - (n-k-1), \ge d] = [n - d^\perp, k - d^\perp + 1, \ge d]$ code $C'$.
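The matrix manipulation of Construction Y1 can be sketched as follows (our own illustration; it assumes the minimum-weight dual codeword appears as a row of $\mathbf{H}$ and that the remaining rows stay independent). As a check, applying it to a parity-check matrix of the self-dual [8, 4, 4] extended Hamming code ($d^\perp = 4$) yields the parity-check matrix of the $[4, 1, 4]$ repetition code, matching the predicted $[n - d^\perp, k - d^\perp + 1, \ge d]$ parameters.

```python
def construction_y1(h_rows, row_index):
    """Construction Y1 sketch: h_rows[row_index] is assumed to be a
    minimum-weight codeword of the dual code. Delete the columns in its
    support, then drop the resulting all-zero row(s)."""
    support = {j for j, bit in enumerate(h_rows[row_index]) if bit}
    shortened = [[bit for j, bit in enumerate(row) if j not in support]
                 for row in h_rows]
    return [row for row in shortened if any(row)]

# Parity-check matrix (= generator matrix) of the self-dual [8, 4, 4]
# extended Hamming code; the third row has weight d_perp = 4.
h = [[1, 1, 0, 0, 1, 1, 0, 0],
     [1, 0, 1, 0, 1, 0, 1, 0],
     [0, 0, 0, 1, 1, 1, 1, 0],
     [1, 0, 0, 0, 0, 1, 1, 1]]
h_new = construction_y1(h, 2)
```

Each surviving row of `h_new` checks that one coordinate equals the first, so the shortened code is indeed the length-4 repetition code.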

#### *5.5.1 New Binary Codes from Cyclic Codes of Length 151*

Amongst all of the cyclic codes in Table 4.3, those of length 151 were found to have the highest number of minimum distances matching Brouwer's lower bounds [6]. This shows that binary cyclic codes of length 151 are indeed good codes. Since 151 is a prime, cyclic codes of this length are special in that all of the irreducible factors of $x^{151} - 1$, apart from $1 + x$, have a fixed degree of 15. Having a fixed degree implies that duadic codes [14], which include the quadratic residue codes, also exist for this length. Due to their large minimum distance, they are good candidate component codes for Constructions X and XX.


**Table 5.2** Order of $\beta$ in an optimum chain of $[151, k\_i, d\_i]$ cyclic codes

**Definition 5.2** (*Chain of Cyclic Codes*) A pair of cyclic codes, $C\_1 = [n, k\_1, d\_1]$ and $C\_2 = [n, k\_2, d\_2]$ where $k\_1 > k\_2$, is nested, denoted $C\_1 \supset C\_2$, if all roots of $C\_1$ are contained among those of $C\_2$. Here, the roots refer to those of the generator polynomial. By appropriate arrangement of their roots, cyclic codes of the same length may be partitioned into a sequence of cyclic codes $C\_1 \supset C\_2 \supset \ldots \supset C\_t$. This sequence of codes is termed a chain of cyclic codes.

Given all cyclic codes of the same length, it is important to order the roots of these cyclic codes so that an optimum chain can be obtained. For all cyclic codes of length 151 given in Table 4.3 whose generator polynomial contains $1 + x$ as a factor, the ordering of roots (excluding the conjugate roots) shown in Table 5.2 results in an optimum chain arrangement. Here $\beta$ is a primitive 151st root of unity. Similarly, a chain which contains cyclic codes whose generator polynomial is not divisible by $1 + x$ can also be obtained.

All the constituent codes in the chain $C\_1 \supset C\_2 \supset \ldots \supset C\_{10}$ of Table 5.2 are cyclic. Following Grassl [10], a chain of non-cyclic subcodes may also be constructed from a chain of cyclic codes. This is because, for a given generator matrix of an $[n, k, d]$ cyclic code (not necessarily in row-echelon form), removing the last $i$ rows of this matrix produces an $[n, k-i, \ge d]$ code which is no longer cyclic. As a consequence, with respect to Table 5.2, there exist $[151, k, d]$ linear codes for $15 \le k \le 150$.

Each pair of codes in the $[151, k, d]$ chain is a nested pair which can be used as component codes for Construction X to produce another linear code with increased distance. There is a chance that the minimum distance of the resulting linear code is larger than that of the best-known code of the same length and dimension. In order to find such cases, the following exhaustive approach has been taken. There are $\binom{150-15+1}{2} = \binom{136}{2}$ distinct pairs of codes in the above chain of linear codes, and each pair, say $C\_1 = [n, k\_1, d\_1] \supset C\_2 = [n, k\_2, d\_2]$, is combined using Construction X with an auxiliary code $A$, which is an $[n', k\_1 - k\_2, d']$ best-known linear code. The minimum distance of the resulting code $C\_X$ is then compared to that of the best-known linear code of the same length and dimension to check for a possible improvement. Two improvements were obtained and they are tabulated in the top half of Table 5.3.

In the case where $k\_1 - k\_2$ is small, the minimum distance $d\_1$ of $C\_1$, obtained from a chain of linear codes, can be unsatisfactory. We can improve $d\_1$ by enlarging $C\_2$ with a vector $\mathbf{v}$ of length $n$, i.e. adding $\mathbf{v}$ as an additional row of $\mathbf{G}\_{C\_2}$. In order to find a vector $\mathbf{v}$ that maximises the minimum distance of the enlarged code, we have adopted the following procedure. Choose a code $C\_2 = [n, k\_2, d\_2]$ that has a sufficiently high minimum distance.

Assuming that $\mathbf{G}\_{C\_2}$ is in reduced-echelon form, generate a vector $\mathbf{v}$ which satisfies the following conditions:


The vector $\mathbf{v}$ is then appended to $\mathbf{G}\_{C\_2}$ as an additional row. The minimum distance of the resulting code is computed using Algorithm 5.1. A threshold is applied during the minimum distance evaluation, and termination occurs whenever $d\_{ub} \le d\_1$, in which case a different $\mathbf{v}$ is chosen and Algorithm 5.1 is restarted; or whenever $d\_1 < d\_{ub} \le d\_{lb}$, which means that an improvement has been found.

Using this approach, we found two new linear codes, [151, 77, 20] and [151, 62, 27], which have higher minimum distance than the corresponding codes obtained from a chain of nested cyclic codes. These two codes are obtained starting from the cyclic code [151, 76, 23], which has roots $\{\beta, \beta^5, \beta^{15}, \beta^{35}, \beta^{37}\}$, and the cyclic code [151, 61, 31], which has roots $\{\beta, \beta^3, \beta^5, \beta^{11}, \beta^{15}, \beta^{37}\}$, respectively, and therefore

$$[151, 77, 20] \supset [151, 76, 23]$$

and

$$[151, 62, 27] \supset [151, 61, 31].$$

The second half of Table 5.3 shows the foundation codes for these new codes.

Note that when searching for the [151, 62, 27] code, we exploited the property that the [152, 61, 32] code, obtained by extending the [151, 61, 31] cyclic code, is doubly even. We chose the additional vector $\mathbf{v}$ such that extending the enlarged code $[151, 62, d\_1]$ again yields a doubly even code. This implies the congruence $d\_1 \equiv 0$ or $3 \pmod 4$ for the minimum distance of the enlarged code. Hence, it is sufficient to establish a lower bound $d\_{lb} = 25$ using Algorithm 5.1 to show that $d\_1 \ge 27$.

Furthermore, we also derived two different codes, $C\_2 = [151, 62, 27] \subset C\_1$ and $C\_3 = [151, 62, 27] \subset C\_1$, where $C\_1 = [151, 63, 23]$ and $C\_4 = C\_2 \cap C\_3 = [151, 61, 31]$. Using Construction XX, a [159, 63, 31] code is obtained; see Table 5.4.
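Assuming length-4 repetition codes as the two auxiliary codes (so that $151 + 4 + 4 = 159$ and $d'\_1 = d'\_2 = 4$; the auxiliary choice is our inference, not stated in the text), the distance formula of Theorem 5.2 is consistent with the tabulated result:

```latex
d = \min\{d_4,\; d_3 + d'_1,\; d_2 + d'_2,\; d_1 + d'_1 + d'_2\}
  = \min\{31,\; 27 + 4,\; 27 + 4,\; 23 + 4 + 4\} = 31.
```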


**Table 5.3** New binary codes from Construction X and cyclic codes of length 151

**Table 5.4** New binary code from Construction XX and cyclic codes of length 151


# *5.5.2 New Binary Codes from Cyclic Codes of Length* **≥** *199*

We know from Table 5.1 that there exists an outstanding [199, 100, 31] cyclic code. The extended code, obtained by annexing an overall parity-check bit, is a [200, 100, 32] doubly even self-dual code. Being self-dual, its dual code also has minimum distance 32. Using Construction Y1 (Theorem 5.3), a new [168, 69, 32] binary code is obtained; the minimum distance of the previously best-known [168, 69] binary linear code is 30.

Considering cyclic codes of length 205, in addition to a [205, 61, 46] cyclic code (see Table 5.1), there also exists a [205, 61, 45] cyclic code which contains a [205, 60, 48] cyclic code as its even-weight subcode. Applying Construction X (Theorem 5.1) to the [205, 61, 45]⊃[205, 60, 48] pair of cyclic codes with a repetition code of length 3 as the auxiliary code, a [208, 61, 48] new binary linear code is constructed, which improves Brouwer's lower bound distance by 2.

Furthermore, by analysing the dual codes of the [255, 65, 63] cyclic code in Table 5.1 and of its [255, 64, 64] even-weight subcode, it was found that both have minimum distance 8. Applying Construction Y1 (Theorem 5.3), we obtain the new [247, 57, 64] and [247, 58, 63] binary linear codes, which improve on Brouwer's lower bound distances by 2 and 1, respectively.

# **5.6 Concluding Observations on Producing New Binary Codes**

In the search for error-correcting codes with large minimum distance, having a fast, efficient algorithm to compute the exact minimum distance of a linear code is important. The evolution of the various algorithms for evaluating the minimum distance of a binary linear code, from the naive approach to Zimmermann's efficient approach, has been explored in detail. In addition to these algorithms, Chen's approach to computing the minimum distance of binary cyclic codes is a significant breakthrough.

The core basis of a minimum distance evaluation algorithm is codeword enumeration. As the weight of the information vector increases, the number of codewords to enumerate grows exponentially. Zimmermann's very useful algorithm may be improved by omitting generator matrices whose overlapping information sets never contribute to the lower bound during enumeration. Early termination is also possible whenever a codeword is found whose weight meets the lower bound established at the previous enumeration step. In addition, if the code under consideration has the property that every codeword weight is divisible by 2 or 4, the number of codewords that need to be enumerated can be reduced considerably.
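The enumeration idea above can be sketched in a few lines. The following is a minimal, single-information-set version in Python (the full Brouwer-Zimmermann method uses several information sets and several generator matrices); the function name and the systematic-form assumption on the generator matrix are illustrative, not from the text.

```python
from itertools import combinations

def min_distance(G):
    """Minimum distance of a binary [n, k] code by codeword enumeration.

    G is a list of k generator rows (lists of n bits) in systematic form.
    Codewords are enumerated by information-vector weight w = 1, 2, ...;
    every codeword not yet enumerated has information weight >= w, hence
    overall weight >= w, so the search stops early once the best weight
    found so far is <= w (a one-matrix version of the lower-bound
    argument described in the text)."""
    k, n = len(G), len(G[0])
    best = n
    for w in range(1, k + 1):
        if best <= w:            # lower bound reached: stop early
            return best
        for rows in combinations(range(k), w):
            cw = [0] * n
            for r in rows:
                cw = [a ^ b for a, b in zip(cw, G[r])]
            best = min(best, sum(cw))
    return best

# The [7, 4, 3] Hamming code in systematic form
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 0, 1, 1],
     [0, 0, 1, 0, 1, 1, 1],
     [0, 0, 0, 1, 1, 0, 1]]
```

For a code known to be doubly even, the enumeration could additionally skip candidate weights not divisible by 4, which is the divisibility saving mentioned above.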

With some simple modifications, these algorithms can also be used to collect and hence, count all codewords of a given weight to determine all or part of the weight spectrum of a code.

Given a generator matrix, codewords may be generated efficiently by taking linear combinations of the rows of this matrix. The faster these combinations can be generated, the less time the minimum distance evaluation takes. One efficient algorithm for generating the combinations is the revolving-door algorithm, which has the useful property that the generation of combinations can readily be implemented in parallel. Having an efficient minimum distance computation algorithm that runs in parallel on multiple computers has allowed us to extend earlier research results [8, 21, 22] on the evaluation of the minimum distance of cyclic codes. In this way, we obtained the highest minimum distance attainable by all binary cyclic codes of odd lengths from 129 to 189. None of these cyclic codes has a minimum distance exceeding that of the best-known linear code of the same length and dimension, as given by the lower bounds in [6]. However, 134 cyclic codes meet the lower bounds (see Sect. 5.3), and encoders and decoders may be easier to implement for cyclic codes.
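As a sketch of this ordering, the revolving-door sequence can be generated by a simple recursion (practical implementations are iterative and loop-free; this recursive form is an assumed minimal variant for illustration):

```python
def revolving_door(n, k):
    """k-subsets of {0, ..., n-1} in revolving-door order: consecutive
    subsets differ by exactly one element swapped in and one out, so a
    running sum of generator rows needs only two row additions per
    codeword."""
    if k == 0:
        return [[]]
    if k == n:
        return [list(range(n))]
    without = revolving_door(n - 1, k)                  # subsets avoiding n-1
    with_last = [c + [n - 1]
                 for c in reversed(revolving_door(n - 1, k - 1))]
    return without + with_last
```

Because the list for (*n*, *k*) splits into two independently generated halves, disjoint sublists can be produced on different machines, which is the parallelisation property noted above.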

Having an efficient, multiple computer based, minimum distance computation algorithm also allowed us to search for the existence of binary cyclic codes of length longer than 189 which are improvements to Brouwer's lower bounds. We found 35 of these cyclic codes, namely

[195, 66, 42], [195, 67, 41], [195, 68, 40], [195, 69, 39], [195, 73, 38], [195, 74, 38], [195, 75, 37], [195, 78, 36], [199, 99, 32], [199, 100, 32], [205, 60, 48], [205, 61, 46], [215, 70, 46], [215, 71, 46], [223, 74, 48], [223, 75, 47], [229, 76, 48], [233, 58, 60], [233, 59, 60], [255, 48, 76], [255, 49, 75], [255, 50, 74], [255, 51, 74], [255, 52, 72], [255, 53, 72], [255, 54, 70], [255, 55, 70], [255, 56, 68], [255, 57, 68], [255, 58, 66], [255, 60, 66], [255, 62, 66], [255, 63, 65], [255, 64, 64], [255, 65, 63].

From the cyclic codes above, using Construction X to lengthen a code or Construction Y1 to shorten one, four additional improvements to the lower bounds of [6] are found, namely







**Table 5.7**



**Table 5.8**


**Table 5.9** Updated minimum distance lower bounds of linear codes *C* = [*n*, *k*] for 225 ≤ *n* ≤ 256 and 63 ≤ *k* ≤ 76

[168, 69, 32], [208, 61, 48], [247, 57, 64], [247, 58, 63] .

Five new linear codes, which are derived from cyclic codes of length 151, have also been constructed. These new codes, which are produced by Constructions X and XX, are

[154, 77, 23], [155, 62, 31], [159, 63, 31], [171, 60, 35], [174, 72, 31] .

Given an [*n*, *k*, *d*] code *C*, where *d* is larger than the minimum distance of the best-known linear code of the same *n* and *k*, it is possible to obtain more codes, whose minimum distance still exceeds that of the corresponding best-known linear code, by recursively extending (annexing parity checks), puncturing and/or shortening *C*. For example, consider the new [168, 69, 32] code as a starting point. New codes are obtained by annexing parity-check bits, giving [168 + *i*, 69, 32] codes for 1 ≤ *i* ≤ 3. Puncturing one bit gives a new [167, 69, 31] code, and shortening gives [168 − *i*, 69 − *i*, 32] codes for 1 ≤ *i* ≤ 5, that is, 5 new codes with a minimum distance of 32. More improvements are obtained by shortening these extended and punctured codes. Overall, with all of the new codes described and presented in this chapter, there are some 901 new binary linear codes which improve on Brouwer's lower bounds. The updated lower bounds are tabulated in Tables 5.5, 5.6, 5.7, 5.8 and 5.9 in Appendix "Improved Lower Bounds of the Minimum Distance of Binary Linear Codes".
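The parameter bookkeeping for these derived codes can be sketched as follows (a hypothetical helper, not from the text; the distances listed are the guaranteed values, and the true minimum distance may be larger):

```python
def derived_codes(n, k, d):
    """Parameters of codes derived from an [n, k, d] binary code:
    extending (annexing parity checks) keeps d, puncturing one bit
    gives [n-1, k, d-1], and shortening gives [n-i, k-i, d]."""
    out = [(n + i, k, d) for i in range(1, 4)]        # extended codes
    out.append((n - 1, k, d - 1))                      # punctured code
    out += [(n - i, k - i, d) for i in range(1, 6)]    # shortened codes
    return out
```

Starting from [168, 69, 32], this reproduces the nine derived parameter sets listed above.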

## **5.7 Summary**

Methods have been described and presented which may be used to determine the minimum Hamming distance and weight distribution of a linear code. These are the main tools for testing new codes which are candidates for improvements to the currently known best codes. Several efficient algorithms for computing the minimum distance and weight distribution of linear codes have been explored in detail. The many different methods of constructing codes have been described, particularly those based on using known good or outstanding codes as a construction basis. Using such methods, several hundred new codes have been presented or described which are improvements to the public database of best-known codes.

For cyclic codes, which have implementation advantages over other codes, many new outstanding codes have been presented including the determination of a table giving the code designs and highest attainable minimum distance of all binary cyclic codes of odd lengths from 129 to 189. It has been shown that outstanding cyclic codes may be used as code components to produce new codes that are better than the previously thought best codes, for the same code length and code rate.

## **Appendix**

# **Improved Lower Bounds of the Minimum Distance of Binary Linear Codes**

The following tables list the updated lower bounds on the minimum distance of linear codes over F2. These improvements, 901 in total, are due to the new binary linear codes described above. In the tables, entries marked with *C* refer to cyclic codes; those marked with *X*, *XX* and *Y1* refer to codes obtained from Constructions X, XX and Y1, respectively. Similarly, entries marked with *E*, *P* and *S* denote [*n*, *k*, *d*] codes obtained by extending (annexing an overall parity-check bit) (*n* − 1, *k*, *d* − 1) codes, puncturing (*n* + 1, *k*, *d* + 1) codes and shortening (*n* + 1, *k* + 1, *d*) codes, respectively. Unmarked entries are the original lower bounds of Brouwer [6].

## **References**




# **Chapter 6 Lagrange Codes**

#### **6.1 Introduction**

Joseph Louis Lagrange was a famous eighteenth-century Italian mathematician [1] credited with minimum-degree polynomial interpolation amongst his many other achievements. Polynomial interpolation may be applied straightforwardly using Galois fields and provides the basis for an extensive family of error-correcting codes. For a Galois field *GF*(2<sup>*m*</sup>), the maximum code length is 2<sup>*m*+1</sup> symbols, consisting of 2<sup>*m*</sup> data symbols and 2<sup>*m*</sup> parity symbols. Many of the different types of codes originated by Goppa [3, 4] may be linked to Lagrange codes.

## **6.2 Lagrange Interpolation**

The interpolation polynomial, *p*(*z*), is constructed such that the value of the polynomial for each element of *GF*(2<sup>*m*</sup>) is equal to a data symbol *x*<sub>*i*</sub>, also from *GF*(2<sup>*m*</sup>). Thus,

$$\begin{array}{rcl}
p(0) &=& x\_0 \\
p(1) &=& x\_1 \\
p(\alpha) &=& x\_2 \\
p(\alpha^2) &=& x\_3 \\
\vdots & & \vdots \\
p(\alpha^{2^m - 3}) &=& x\_{2^m - 2} \\
p(\alpha^{2^m - 2}) &=& x\_{2^m - 1}
\end{array}$$

Using the method of Lagrange, the interpolation polynomial is constructed as a summation of 2*<sup>m</sup>* polynomials, each of degree 2*<sup>m</sup>* − 1. Thus,

**Table 6.1** *GF*(8) extension field defined by 1 + α + α<sup>3</sup> = 0

| Power of α | Polynomial in α |
|---|---|
| α<sup>0</sup> | 1 |
| α<sup>1</sup> | α |
| α<sup>2</sup> | α<sup>2</sup> |
| α<sup>3</sup> | 1 + α |
| α<sup>4</sup> | α + α<sup>2</sup> |
| α<sup>5</sup> | 1 + α + α<sup>2</sup> |
| α<sup>6</sup> | 1 + α<sup>2</sup> |

$$p(z) = \sum\_{i=0}^{2^m - 1} p\_i(z) \tag{6.1}$$

where

$$p\_i(z) = x\_i \frac{z}{\alpha^{i-1}} \prod\_{j=0,\, j \neq i-1}^{j=2^m-2} \frac{z - \alpha^j}{\alpha^{i-1} - \alpha^j} \quad \text{for} \quad i \neq 0 \tag{6.2}$$

and

$$p\_0(z) = x\_0 \prod\_{j=0}^{j=2^m-2} \frac{z-\alpha^j}{(-\alpha^j)}\tag{6.3}$$

The idea is that each of the *p*<sub>*i*</sub>(*z*) polynomials has a value of zero for *z* equal to each element of *GF*(2<sup>*m*</sup>), except for the one element corresponding to *i* (namely α<sup>*i*−1</sup>, or 0 when *i* = 0).

A simpler form for the polynomials *pi*(*z*) is given by

$$p\_i(z) = x\_i\, \frac{z\,(z^{2^m - 1} - 1)}{z - \alpha^{i-1}} \quad \text{for} \quad i \neq 0 \tag{6.4}$$

and

$$p\_0(z) = -x\_0(z^{2^m - 1} - 1)\tag{6.5}$$

In an example using *GF*(2<sup>3</sup>), all of the nonzero field elements may be expressed as a power of a primitive root α of the primitive polynomial 1 + *x* + *x*<sup>3</sup>, with α<sup>7</sup> = 1. The nonzero field elements are tabulated in Table 6.1.
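Table 6.1 can be reproduced mechanically. A short Python sketch (the function name is illustrative; field elements are represented as bit masks, with bit *i* holding the coefficient of α<sup>*i*</sup>):

```python
def gf_antilog(m, poly):
    """Successive powers of the primitive root alpha as m-bit masks
    (bit i holds the coefficient of alpha^i); poly is the primitive
    polynomial as a mask, e.g. 0b1011 for 1 + x + x^3."""
    table, a = [], 1
    for _ in range(2 ** m - 1):
        table.append(a)
        a <<= 1                  # multiply by alpha
        if a & (1 << m):         # degree overflow: reduce by the polynomial
            a ^= poly
    return table

gf8 = gf_antilog(3, 0b1011)
# gf8[3] == 0b011, i.e. alpha^3 = 1 + alpha, matching Table 6.1
```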

All of the 8 polynomials *pi*(*z*) are given below

$$\begin{array}{l}
p\_0(z) = x\_0(z^7 + 1) \\
p\_1(z) = x\_1(z^7 + z^6 + z^5 + z^4 + z^3 + z^2 + z) \\
p\_2(z) = x\_2(z^7 + \alpha z^6 + \alpha^2 z^5 + \alpha^3 z^4 + \alpha^4 z^3 + \alpha^5 z^2 + \alpha^6 z) \\
p\_3(z) = x\_3(z^7 + \alpha^2 z^6 + \alpha^4 z^5 + \alpha^6 z^4 + \alpha z^3 + \alpha^3 z^2 + \alpha^5 z) \\
p\_4(z) = x\_4(z^7 + \alpha^3 z^6 + \alpha^6 z^5 + \alpha^2 z^4 + \alpha^5 z^3 + \alpha z^2 + \alpha^4 z) \\
p\_5(z) = x\_5(z^7 + \alpha^4 z^6 + \alpha z^5 + \alpha^5 z^4 + \alpha^2 z^3 + \alpha^6 z^2 + \alpha^3 z) \\
p\_6(z) = x\_6(z^7 + \alpha^5 z^6 + \alpha^3 z^5 + \alpha z^4 + \alpha^6 z^3 + \alpha^4 z^2 + \alpha^2 z) \\
p\_7(z) = x\_7(z^7 + \alpha^6 z^6 + \alpha^5 z^5 + \alpha^4 z^4 + \alpha^3 z^3 + \alpha^2 z^2 + \alpha z)
\end{array}$$

These polynomials are simply summed to produce the Lagrange interpolation polynomial *p*(*z*)

$$\begin{array}{rl}
p(z) = & z^7(x\_0 + x\_1 + x\_2 + x\_3 + x\_4 + x\_5 + x\_6 + x\_7) \\
& +\, z^6(x\_1 + \alpha x\_2 + \alpha^2 x\_3 + \alpha^3 x\_4 + \alpha^4 x\_5 + \alpha^5 x\_6 + \alpha^6 x\_7) \\
& +\, z^5(x\_1 + \alpha^2 x\_2 + \alpha^4 x\_3 + \alpha^6 x\_4 + \alpha x\_5 + \alpha^3 x\_6 + \alpha^5 x\_7) \\
& +\, z^4(x\_1 + \alpha^3 x\_2 + \alpha^6 x\_3 + \alpha^2 x\_4 + \alpha^5 x\_5 + \alpha x\_6 + \alpha^4 x\_7) \\
& +\, z^3(x\_1 + \alpha^4 x\_2 + \alpha x\_3 + \alpha^5 x\_4 + \alpha^2 x\_5 + \alpha^6 x\_6 + \alpha^3 x\_7) \\
& +\, z^2(x\_1 + \alpha^5 x\_2 + \alpha^3 x\_3 + \alpha x\_4 + \alpha^6 x\_5 + \alpha^4 x\_6 + \alpha^2 x\_7) \\
& +\, z(x\_1 + \alpha^6 x\_2 + \alpha^5 x\_3 + \alpha^4 x\_4 + \alpha^3 x\_5 + \alpha^2 x\_6 + \alpha x\_7) + x\_0
\end{array} \tag{6.6}$$

This can easily be verified by evaluating *p*(*z*) for each element of *GF*(2<sup>3</sup>) to produce

$$\begin{array}{l} p(0) = x\_0 \\ p(1) = x\_1 \\ p(\alpha) = x\_2 \\ p(\alpha^2) = x\_3 \\ p(\alpha^3) = x\_4 \\ p(\alpha^4) = x\_5 \\ p(\alpha^5) = x\_6 \\ p(\alpha^6) = x\_7 \end{array}$$
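This check can also be run numerically. The sketch below rebuilds the interpolation polynomial over *GF*(8) with the naive basis-product formula and confirms the evaluations; the helper names are illustrative, and subtraction is XOR in characteristic 2.

```python
# GF(8) per Table 6.1: EXP[i] = alpha^i as a bit mask over {1, alpha, alpha^2}
EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {v: i for i, v in enumerate(EXP)}

def mul(a, b):
    return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 7]

def inv(a):
    return EXP[(-LOG[a]) % 7]

def poly_eval(p, z):
    # Horner evaluation; p[i] is the coefficient of z^i
    acc = 0
    for c in reversed(p):
        acc = mul(acc, z) ^ c
    return acc

def lagrange(values):
    """Degree-7 polynomial p with p(0) = values[0] and
    p(alpha^(i-1)) = values[i], as in Sect. 6.2."""
    points = [0] + EXP            # 0, 1, alpha, ..., alpha^6
    p = [0] * 8
    for i, (xi, yi) in enumerate(zip(points, values)):
        basis = [1]               # running product of (z - xj)/(xi - xj)
        for j, xj in enumerate(points):
            if j == i:
                continue
            s = inv(xi ^ xj)      # 1/(xi - xj); subtraction is XOR
            sxj = mul(s, xj)
            new = [0] * (len(basis) + 1)
            for d, c in enumerate(basis):
                new[d + 1] ^= mul(c, s)    # the s * z * basis part
                new[d] ^= mul(c, sxj)      # the s * xj * basis part
            basis = new
        for d, c in enumerate(basis):
            p[d] ^= mul(yi, c)
    return p
```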

#### **6.3 Lagrange Error-Correcting Codes**

The interpolation polynomial *p*(*z*) may be expressed in terms of its coefficients and used as a basis for defining error-correcting codes.

$$p(z) = \sum\_{i=0}^{2^m - 1} \mu\_i z^i \tag{6.7}$$

It is clear that an interpolation equation and a parity check equation are equivalent, and for the 8 identities given by the interpolation polynomial we may define 8 parity check equations:

$$\begin{array}{ll}
x\_0 + p(0) &= 0 \\
x\_1 + p(1) &= 0 \\
x\_2 + p(\alpha) &= 0 \\
x\_3 + p(\alpha^2) &= 0 \\
x\_4 + p(\alpha^3) &= 0 \\
x\_5 + p(\alpha^4) &= 0 \\
x\_6 + p(\alpha^5) &= 0 \\
x\_7 + p(\alpha^6) &= 0
\end{array} \tag{6.8}$$

The 8 parity check equations become

$$\begin{array}{rcl}
x\_0 + \mu\_0 &=& 0 \\
x\_1 + \mu\_1 + \mu\_2 + \mu\_3 + \mu\_4 + \mu\_5 + \mu\_6 + \mu\_7 &=& 0 \\
x\_2 + \alpha\mu\_1 + \alpha^2\mu\_2 + \alpha^3\mu\_3 + \alpha^4\mu\_4 + \alpha^5\mu\_5 + \alpha^6\mu\_6 + \mu\_7 &=& 0 \\
x\_3 + \alpha^2\mu\_1 + \alpha^4\mu\_2 + \alpha^6\mu\_3 + \alpha\mu\_4 + \alpha^3\mu\_5 + \alpha^5\mu\_6 + \mu\_7 &=& 0 \\
x\_4 + \alpha^3\mu\_1 + \alpha^6\mu\_2 + \alpha^2\mu\_3 + \alpha^5\mu\_4 + \alpha\mu\_5 + \alpha^4\mu\_6 + \mu\_7 &=& 0 \\
x\_5 + \alpha^4\mu\_1 + \alpha\mu\_2 + \alpha^5\mu\_3 + \alpha^2\mu\_4 + \alpha^6\mu\_5 + \alpha^3\mu\_6 + \mu\_7 &=& 0 \\
x\_6 + \alpha^5\mu\_1 + \alpha^3\mu\_2 + \alpha\mu\_3 + \alpha^6\mu\_4 + \alpha^4\mu\_5 + \alpha^2\mu\_6 + \mu\_7 &=& 0 \\
x\_7 + \alpha^6\mu\_1 + \alpha^5\mu\_2 + \alpha^4\mu\_3 + \alpha^3\mu\_4 + \alpha^2\mu\_5 + \alpha\mu\_6 + \mu\_7 &=& 0
\end{array} \tag{6.9}$$

A number of different codes may be derived from these equations. Using the four equations that follow the first one, and setting *x*<sub>2</sub> and *x*<sub>3</sub> equal to 0, the following parity check matrix is obtained, producing a (9, 5) code:

$$\mathbf{H}\_{9,5} = \begin{bmatrix}
1 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & \alpha & \alpha^2 & \alpha^3 & \alpha^4 & \alpha^5 & \alpha^6 & 1 \\
0 & 0 & \alpha^2 & \alpha^4 & \alpha^6 & \alpha & \alpha^3 & \alpha^5 & 1 \\
0 & 1 & \alpha^3 & \alpha^6 & \alpha^2 & \alpha^5 & \alpha & \alpha^4 & 1
\end{bmatrix}$$

Rearranging the order of the columns produces a parity check matrix, **Ĥ**<sub>9,5</sub>, identical to that of the MDS (9, 5, 5) code based on the doubly extended Reed–Solomon code [7].

$$\hat{\mathbf{H}}\_{9,5} = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 \\
1 & \alpha & \alpha^2 & \alpha^3 & \alpha^4 & \alpha^5 & \alpha^6 & 0 & 0 \\
1 & \alpha^2 & \alpha^4 & \alpha^6 & \alpha & \alpha^3 & \alpha^5 & 0 & 0 \\
1 & \alpha^3 & \alpha^6 & \alpha^2 & \alpha^5 & \alpha & \alpha^4 & 0 & 1
\end{bmatrix}
$$

Correspondingly, we know that the code with parity check matrix **H**<sub>9,5</sub> derived from the Lagrange interpolating polynomial is MDS and has a minimum Hamming distance of 5. Useful, longer codes can also be obtained. Adding the first equation of (6.9) to the second and setting *x*<sub>0</sub> equal to *x*<sub>1</sub>, a parity check matrix for a (10, 6) code is obtained:

$$\mathbf{H}\_{10,6} = \begin{bmatrix}
0 & 1 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & \alpha & \alpha^2 & \alpha^3 & \alpha^4 & \alpha^5 & \alpha^6 & 1 \\
0 & 0 & 0 & \alpha^2 & \alpha^4 & \alpha^6 & \alpha & \alpha^3 & \alpha^5 & 1 \\
0 & 0 & 1 & \alpha^3 & \alpha^6 & \alpha^2 & \alpha^5 & \alpha & \alpha^4 & 1
\end{bmatrix}$$

It is straightforward to map any code with *GF*(2<sup>*m*</sup>) symbols into a binary code by simply mapping each *GF*(2<sup>*m*</sup>) symbol into an *m* × *m* binary matrix using the *GF*(2<sup>*m*</sup>) table of field elements. If the codeword coordinate is α<sup>*i*</sup>, the coordinate is replaced with the matrix whose columns are the binary representations of the *GF*(2<sup>*m*</sup>) symbols:

$$\left[ \alpha^i \; \alpha^{i+1} \; \alpha^{i+2} \; \dots \; \alpha^{i+m-1} \right]$$

As an example for *GF*(2<sup>3</sup>), if the codeword coordinate is α<sup>3</sup>, the symbol is replaced with the binary matrix whose columns are the binary values of α<sup>3</sup>, α<sup>4</sup> and α<sup>5</sup> using Table 6.1.

$$
\begin{bmatrix} 1 \ 0 \ 1 \\ 1 \ 1 \ 1 \\ 0 \ 1 \ 1 \end{bmatrix}
$$

In another example the symbol α<sup>0</sup> produces the identity matrix

$$
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
$$
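This symbol-to-matrix mapping for *GF*(8) can be sketched directly from Table 6.1 (bit *i* of each mask is the coefficient of α<sup>*i*</sup>; the function name is illustrative):

```python
EXP = [1, 2, 4, 3, 6, 7, 5]    # GF(8) powers of alpha, from Table 6.1

def symbol_matrix(i, m=3):
    """m x m binary matrix replacing codeword coordinate alpha^i:
    column j is the binary representation of alpha^(i+j)."""
    cols = [EXP[(i + j) % (2 ** m - 1)] for j in range(m)]
    return [[(c >> r) & 1 for c in cols] for r in range(m)]

# symbol_matrix(3) reproduces the matrix for alpha^3 shown above,
# and symbol_matrix(0) gives the identity matrix
```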

The (10, 6) GF(8) code above forms a (30, 18) binary code with parity check matrix

$$\mathbf{H}\_{30,18} = \begin{bmatrix}
000100000100100100100100100100 \\
000010000010010010010010010010 \\
000001000001001001001001001001 \\
100100000001010101011111110100 \\
010010000101011111110100001010 \\
001001000010101011111110100001 \\
000000000010011110001101111100 \\
000000000011110001101111100010 \\
000000000101111100010011110001 \\
000000100101110010111001011100 \\
000000010111001011100101110010 \\
000000001011100101110010111001
\end{bmatrix}$$

The minimum Hamming distance of this code has been evaluated and turns out to be 4. Methods for evaluating the minimum Hamming distance are described in Chap. 5. Consequently, extending the length of the code by one symbol has reduced the *d<sub>min</sub>* by 1. The *d<sub>min</sub>* may be increased by 2 by adding an overall parity bit across the first two symbols plus an overall parity bit across all bits, producing a (32, 18, 6) code with parity check matrix

$$\mathbf{H}\_{32,18} = \begin{bmatrix}
00010000010010010010010010010000 \\
00001000001001001001001001001000 \\
00000100000100100100100100100100 \\
10010000000101010101111111010000 \\
01001000010101111111010000101000 \\
00100100001010101111111010000100 \\
11111100000000000000000000000010 \\
00000000001001111000110111110000 \\
00000000001111000110111110001000 \\
00000000010111110001001111000100 \\
00000010010111001011100101110000 \\
00000001011100101110010111001000 \\
00000000101110010111001011000100 \\
11111111111111111111111111111111
\end{bmatrix}$$

This is a good code: weight spectrum analysis shows that it has the same minimum Hamming distance as the best-known (32, 18, 6) code [5]. It is interesting to note that in extending the length of the code beyond the MDS length of 9 symbols for *GF*(2<sup>3</sup>), two *weak* symbols are produced, but these are counterbalanced by adding an overall parity bit across these two symbols.

# **6.4 Error-Correcting Codes Derived from the Lagrange Coefficients**

In another approach, we may set some of the equations defining the Lagrange polynomial coefficients to zero, and then use these equations to define parity checks for the code. As an example, using *GF*(2<sup>3</sup>), from Eq. (6.6) we may set coefficients μ7, μ6, μ5, μ<sup>4</sup> and μ<sup>3</sup> equal to zero. The parity check equations become

$$\begin{array}{ccccccccc}\mathbf{x}\_{0} & +\mathbf{x}\_{1} & +\mathbf{x}\_{2} & +\mathbf{x}\_{3} & +\mathbf{x}\_{4} & +\mathbf{x}\_{5} & +\mathbf{x}\_{6} & +\mathbf{x}\_{7} & = \mathbf{0} \\ \alpha\mathbf{x}\_{1} & +\alpha^{2}\mathbf{x}\_{2} & +\alpha^{3}\mathbf{x}\_{3} & +\alpha^{4}\mathbf{x}\_{4} & +\alpha^{5}\mathbf{x}\_{5} & +\alpha^{6}\mathbf{x}\_{6} & +\mathbf{x}\_{7} & = \mathbf{0} \\ \alpha^{2}\mathbf{x}\_{1} & +\alpha^{4}\mathbf{x}\_{2} & +\alpha^{6}\mathbf{x}\_{3} & +\alpha\mathbf{x}\_{4} & +\alpha^{3}\mathbf{x}\_{5} & +\alpha^{5}\mathbf{x}\_{6} & +\mathbf{x}\_{7} & = \mathbf{0} \\ \alpha^{3}\mathbf{x}\_{1} & +\alpha^{6}\mathbf{x}\_{2} & +\alpha^{2}\mathbf{x}\_{3} & +\alpha^{5}\mathbf{x}\_{4} & +\alpha\mathbf{x}\_{5} & +\alpha^{4}\mathbf{x}\_{6} & +\mathbf{x}\_{7} & = \mathbf{0} \\ \alpha^{4}\mathbf{x}\_{1} & +\alpha\mathbf{x}\_{2} & +\alpha^{5}\mathbf{x}\_{3} & +\alpha^{2}\mathbf{x}\_{4} & +\alpha^{6}\mathbf{x}\_{5} & +\alpha^{3}\mathbf{x}\_{6} & +\mathbf{x}\_{7} & = \mathbf{0} \end{array} \tag{6.10}$$

and the corresponding parity check matrix is

$$\mathbf{H}\_{8,3} = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
0 & \alpha & \alpha^2 & \alpha^3 & \alpha^4 & \alpha^5 & \alpha^6 & 1 \\
0 & \alpha^2 & \alpha^4 & \alpha^6 & \alpha & \alpha^3 & \alpha^5 & 1 \\
0 & \alpha^3 & \alpha^6 & \alpha^2 & \alpha^5 & \alpha & \alpha^4 & 1 \\
0 & \alpha^4 & \alpha & \alpha^5 & \alpha^2 & \alpha^6 & \alpha^3 & 1
\end{bmatrix} \tag{6.11}$$

As a *GF*(2<sup>3</sup>) code, this code is MDS with a *d<sub>min</sub>* of 6 and is equivalent to the extended Reed–Solomon code. As a binary code, with the following parity check matrix, a (24, 9, 8) code is obtained. This is a good code as it has the same minimum Hamming distance as the best-known (24, 9, 8) code [5].

$$\mathbf{H}\_{24,9} = \begin{bmatrix}
100100100100100100100100 \\
010010010010010010010010 \\
001001001001001001001001 \\
000001010101011111110100 \\
000101011111110100001010 \\
000010101011111110100001 \\
000010011110001101111100 \\
000011110001101111100010 \\
000101111100010011110001 \\
000101110010111001011100 \\
000111001011100101110010 \\
000011100101110010111001 \\
000011001111010110101100 \\
000110101100011001111010 \\
000111010110101100011001
\end{bmatrix}$$

#### **6.5 Goppa Codes**

So far, codes have been constructed using the Lagrange interpolating polynomial in a rather ad hoc manner. Goppa defined a family of codes [3] in terms of the Lagrange interpolating polynomial in which the coordinates of each codeword {*c*<sub>0</sub>, *c*<sub>1</sub>, *c*<sub>2</sub>, …, *c*<sub>2<sup>*m*</sup>−1</sub>}, with {*c*<sub>0</sub> = *x*<sub>0</sub>, *c*<sub>1</sub> = *x*<sub>1</sub>, *c*<sub>2</sub> = *x*<sub>2</sub>, …, *c*<sub>2<sup>*m*</sup>−1</sub> = *x*<sub>2<sup>*m*</sup>−1</sub>}, satisfy the congruence *p*(*z*) modulo *g*(*z*) = 0, where *g*(*z*) is known as the Goppa polynomial.

Goppa codes have coefficients from *GF*(2<sup>*m*</sup>) and, provided *g*(*z*) has no roots which are elements of *GF*(2<sup>*m*</sup>) (which is straightforward to achieve), the Goppa codes have parameters (2<sup>*m*</sup>, *k*, 2<sup>*m*</sup> − *k* + 1). These codes are MDS codes and satisfy the Singleton bound [8]. As binary codes, provided that *g*(*z*) has no roots which are elements of *GF*(2<sup>*m*</sup>) and has no repeated roots, Goppa codes have parameters (2<sup>*m*</sup>, 2<sup>*m*</sup> − *mt*, *d<sub>min</sub>*) where *d<sub>min</sub>* ≥ 2*t* + 1, the Goppa code bound on minimum Hamming distance. Most binary Goppa codes achieve this bound with equality, and *t* is the number of correctable errors for hard decision, bounded distance decoding. Primitive binary BCH codes have parameters (2<sup>*m*</sup> − 1, 2<sup>*m*</sup> − *mt* − 1, *d<sub>min</sub>*), where *d<sub>min</sub>* ≥ 2*t* + 1, and so binary Goppa codes usually have the advantage over binary BCH codes of one additional information bit for the same minimum Hamming distance. However, depending on the cyclotomic cosets, many cases of BCH codes can be found having either *k* > 2<sup>*m*</sup> − *mt* − 1 for a given *t*, or *d<sub>min</sub>* > 2*t* + 1, giving BCH codes the advantage in these cases.

For a Goppa polynomial of degree *r*, there are *r* parity check equations derived from the congruence *p*(*z*) *modulo g*(*z*) = 0. Denoting *g*(*z*) by

$$\mathbf{g}(z) = \mathbf{g}\_r z^r + \mathbf{g}\_{r-1} z^{r-1} + \mathbf{g}\_{r-2} z^{r-2} + \dots + \mathbf{g}\_1 z + \mathbf{g}\_0 \tag{6.12}$$

$$\sum\_{i=0}^{2^m - 1} \frac{c\_i}{z - \alpha\_i} = 0 \quad \text{modulo } \mathbf{g}(\mathbf{z}) \tag{6.13}$$

Since (6.13) is modulo *g*(*z*) then *g*(*z*) is equivalent to 0, and we can add *g*(*z*) to the numerator. Noting that

$$\mathbf{g}(z) = (z - \alpha\_i)q\_i(z) + r\_m \tag{6.14}$$

where *r<sub>m</sub>* is the remainder, an element of *GF*(2<sup>*m*</sup>), after dividing *g*(*z*) by *z* − α<sub>*i*</sub>. Dividing each term *z* − α<sub>*i*</sub> into 1 + *g*(*z*) produces the following:

$$\frac{g(z) + 1}{z - \alpha\_i} = q\_i(z) + \frac{r\_m + 1}{z - \alpha\_i} \tag{6.15}$$

As *r<sub>m</sub>* is a scalar, we may simply pre-weight *g*(*z*) by 1/*r<sub>m</sub>* so that the remainder cancels with the other numerator term, which is 1.

$$\frac{\frac{g(z)}{r\_m} + 1}{z - \alpha\_i} = \frac{q\_i(z)}{r\_m} + \frac{\frac{r\_m}{r\_m} + 1}{z - \alpha\_i} = \frac{q\_i(z)}{r\_m} \tag{6.16}$$

As a result of

$$\mathbf{g}(z) = (z - \alpha\_i)q\_i(z) + r\_m$$

when *z* = α<sub>*i*</sub>, *r<sub>m</sub>* = *g*(α<sub>*i*</sub>).


Substituting for *rm* in (6.16) produces

$$\frac{\frac{g(z)}{g(\alpha\_i)} + 1}{z - \alpha\_i} = \frac{q\_i(z)}{g(\alpha\_i)}\tag{6.17}$$

Since *g*(*z*)/*g*(α<sub>*i*</sub>) modulo *g*(*z*) = 0,

$$\frac{1}{z - \alpha\_i} = \frac{q\_i(z)}{g(\alpha\_i)}\tag{6.18}$$

The quotient polynomial *qi*(*z*) is a polynomial of degree *r* − 1, with coefficients which are a function of α*<sup>i</sup>* and the Goppa polynomial coefficients. Denoting *qi*(*z*) as

$$q\_i(z) = q\_{i,0} + q\_{i,1}z + q\_{i,2}z^2 + q\_{i,3}z^3 + \dots + q\_{i,(r-1)}z^{r-1} \tag{6.19}$$

Since the coefficients of each power of *z* sum to zero, the *r* parity check equations are given by

$$\sum\_{i=0}^{2^m - 1} \frac{c\_i q\_{i,j}}{g(\alpha\_i)} = 0 \quad \text{for} \quad j = 0 \quad \text{to} \quad r - 1 \tag{6.20}$$

If the Goppa polynomial has any roots which are elements of *GF*(2*<sup>m</sup>*), say α*j*, then the codeword coordinate *cj* has to be permanently set to zero in order to satisfy the parity check equations. Effectively, the code length is shortened by the number of roots of *g*(*z*) which are elements of *GF*(2*<sup>m</sup>*). Usually, the Goppa polynomial is chosen to have distinct roots which are not in *GF*(2*<sup>m</sup>*).

Consider an example of a Goppa (32, 28, 5) code. There are 4 parity check symbols defined by the 4 parity check equations and the Goppa polynomial has degree 4. Choosing somewhat arbitrarily the polynomial 1+*z* +*z*<sup>4</sup> which has roots in *GF*(16) but not in *GF*(32), we determine *qi*(*z*) by dividing by *z* − α*i*.

$$q\_i(z) = z^3 + \alpha\_i z^2 + \alpha\_i^2 z + (1 + \alpha\_i^3) \tag{6.21}$$
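The quotient (6.21), and the remainder relation *r<sub>m</sub>* = *g*(α<sub>*i*</sub>), can be checked by synthetic division over *GF*(32). A sketch, assuming the primitive polynomial 1 + *x*<sup>2</sup> + *x*<sup>5</sup> to construct *GF*(32) (the text's Table 6.2 is not reproduced here, so this choice is an assumption; any primitive degree-5 polynomial serves):

```python
# Build GF(32) antilog/log tables from the assumed primitive polynomial
EXP, a = [], 1
for _ in range(31):
    EXP.append(a)
    a <<= 1
    if a & 32:
        a ^= 0b100101          # reduce by x^5 + x^2 + 1
LOG = {v: i for i, v in enumerate(EXP)}

def mul(x, y):
    return 0 if 0 in (x, y) else EXP[(LOG[x] + LOG[y]) % 31]

def divide_linear(g, beta):
    """Synthetic division of g(z) by (z - beta) over GF(2^m);
    g is the low-to-high coefficient list; returns (quotient, remainder)."""
    q, acc = [0] * (len(g) - 1), 0
    for d in range(len(g) - 1, 0, -1):
        acc = mul(acc, beta) ^ g[d]
        q[d - 1] = acc
    return q, mul(acc, beta) ^ g[0]

g = [1, 1, 0, 0, 1]            # g(z) = 1 + z + z^4
beta = EXP[7]                  # an arbitrary element of GF(32)
q, r = divide_linear(g, beta)
b2 = mul(beta, beta)
b3 = mul(b2, beta)
# q equals [1 + beta^3, beta^2, beta, 1], i.e. Eq. (6.21),
# and r equals g(beta), the remainder used in the parity check equations
```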

The 4 parity check equations are

$$\sum\_{i=0}^{31} \frac{c\_i}{g(\alpha\_i)} = 0 \tag{6.22}$$

$$\sum\_{i=0}^{31} \frac{c\_i \alpha\_i}{g(\alpha\_i)} = 0 \tag{6.23}$$

$$\sum\_{i=0}^{31} \frac{c\_i \alpha\_i^2}{g(\alpha\_i)} = 0 \tag{6.24}$$


$$\sum\_{i=0}^{31} \frac{c\_i (1 + \alpha\_i^3)}{g(\alpha\_i)} = 0 \tag{6.25}$$

Using Table 6.2 to evaluate the different terms for *GF*(25), the parity check matrix is

$$\mathbf{H}\_{(32,28,5)} = \begin{bmatrix} 1 \ 1 \ \alpha^{14} \ \alpha^{20} \ \alpha^{25} \ \dots \ \alpha^{10} \\ 0 \ 1 \ \alpha^{15} \ \alpha^{22} \ \alpha^{28} \ \dots \ \alpha^{9} \\ 0 \ 1 \ \alpha^{16} \ \alpha^{24} \ 1 \ \dots \ \alpha^{8} \\ 0 \ 1 \ \alpha^{17} \ \alpha^{26} \ \alpha^{3} \ \dots \ \alpha^{7} \end{bmatrix} \tag{6.26}$$

To implement the Goppa code as a binary code, the symbols in the parity check matrix are replaced with their m-bit binary column representations of each respective *GF*(2*<sup>m</sup>*) symbol. For the (32, 28, 5) Goppa code above, each of the 4 parity symbols will be represented as a 5 bit symbol from Table 6.2. The parity check matrix will now have 20 rows for the binary code. The minimum Hamming distance of the binary Goppa code is improved from *r* + 1 to 2*r* + 1, namely from 5 to 9. Correspondingly, the binary Goppa code becomes a (32, 12, 9) code with parity check matrix



$$\mathbf{H}\_{(32,12,9)} = \begin{bmatrix} 1 \ 1 \ 1 \ 0 \ 1 \ \dots \ 1 \ 1 \\ 0 \ 0 \ 0 \ 0 \ 0 \ \dots \ 0 \\ 0 \ 0 \ 1 \ 1 \ 0 \ \dots \ 0 \\ 0 \ 0 \ 1 \ 1 \ 1 \ \dots \ 0 \\ 0 \ 0 \ 1 \ 0 \ 1 \ \dots \ 1 \\ 0 \ 0 \ 1 \ 0 \ 0 \ \dots \ 1 \\ 0 \ 0 \ 1 \ 0 \ 1 \ \dots \ 1 \\ 0 \ 0 \ 1 \ 0 \ 0 \ \dots \ 1 \\ 0 \ 0 \ 1 \ 1 \ 0 \ \dots \ 1 \\ 0 \ 0 \ 1 \ 1 \ 0 \ \dots \ 1 \\ 0 \ 0 \ 1 \ 1 \ 0 \ \dots \ 0 \\ 0 \ 0 \ 0 \ 1 \ 0 \ \dots \ 1 \\ 0 \ 0 \ 1 \ 1 \ 0 \ \dots \ 1 \\ 0 \ 0 \ 1 \ 1 \ 0 \ \dots \ 0 \\ 0 \ 0 \ 1 \ 1 \ 0 \ \dots \ 0 \\ 0 \ 0 \ 1 \ 1 \ 0 \ \dots \ 0 \\ 0 \ 0 \ 1 \ 1 \ 0 \ \dots \ 0 \\ 0 \ 0 \ 1 \ 0 \ \dots \ 1 \\ 0 \ 0 \ 0 \ 1 \ 0 \ \dots \ 1 \\ 0 \ 0 \ 0 \ 1 \ 0 \ \dots \ 1 \\ 0 \ 0 \ 0 \ 1 \ 0 \ \dots \ 1 \\ 0 \ 0 \ 0 \ 1 \ 0 \ \dots \ 1 \\ 0 \ 0 \ 1 \ 1 \ 0 \ \dots \ 1 \end{bmatrix}$$

#### **6.6 BCH Codes as Goppa Codes**

Surprisingly, the family of Goppa codes includes as a subset the family of BCH codes with codeword coefficients from *GF*(2*<sup>m</sup>*) and parameters (2*<sup>m</sup>* −1, 2*<sup>m</sup>* −1−*t*, *t* +1). As binary codes, using codeword coefficients {0, 1}, the BCH codes have parameters (2*<sup>m</sup>* − 1, 2*<sup>m</sup>* − 1 − *mt*, 2*t* + 1).

For a nonbinary BCH code to correspond to a Goppa code, the Goppa polynomial, *g*(*z*), is given by

$$\mathbf{g}(\mathbf{z}) = \mathbf{z}^t \tag{6.28}$$

There are *t* parity check equations relating to the codeword coordinates {*c*0, *c*1, *c*2,..., *c*2*m*−<sup>2</sup>} and these are given by

$$\sum\_{i=0}^{2^m - 2} \frac{c\_i}{z - \alpha^i} = 0 \quad \text{modulo } z^t \tag{6.29}$$

Dividing 1 by *z* − α*<sup>i</sup>* starting with α*<sup>i</sup>* produces

$$\frac{1}{z - \alpha^i} = \alpha^{-i} + \alpha^{-2i}z + \alpha^{-3i}z^2 + \alpha^{-4i}z^3 + \dots + \alpha^{-it}z^{t-1} + \frac{\alpha^{-(t+1)i}z^t}{z - \alpha^i} \quad (6.30)$$

As α−(*t*+1)*<sup>i</sup> z<sup>t</sup>* modulo *z<sup>t</sup>* = 0, the *t* parity check equations are given by

$$\sum\_{i=0}^{2^m-2} c\_i(\alpha^{-i} + \alpha^{-2i}z + \alpha^{-3i}z^2 + \alpha^{-4i}z^3 + \dots + \alpha^{-it}z^{t-1}) = 0\tag{6.31}$$

Every coefficient of *z*<sup>0</sup> through to *z<sup>t</sup>*−<sup>1</sup> is equated to zero, producing *t* parity check equations. The corresponding parity check matrix is

$$\mathbf{H}\_{(2^{m}-1,2^{m}-t,t+1)} = \begin{bmatrix} 1 \ \alpha^{-1} \ \alpha^{-2} \ \alpha^{-3} \ \alpha^{-4} \ \dots \ \alpha^{-(2^{m}-2)} \\ 1 \ \alpha^{-2} \ \alpha^{-4} \ \alpha^{-6} \ \alpha^{-8} \ \dots \ \alpha^{-2(2^{m}-2)} \\ 1 \ \alpha^{-3} \ \alpha^{-6} \ \alpha^{-9} \ \alpha^{-12} \ \dots \ \alpha^{-3(2^{m}-2)} \\ \ \cdots \ \cdots \ \cdots \ \cdots \ \cdots \ \cdots \\ 1 \ \alpha^{-t} \ \alpha^{-2t} \ \alpha^{-3t} \ \alpha^{-4t} \ \dots \ \alpha^{-t(2^{m}-2)} \end{bmatrix} \tag{6.32}$$

To obtain the binary BCH code, as before, the *GF*(2*<sup>m</sup>*) symbols are replaced with their m-bit binary column representations for each corresponding *GF*(2*<sup>m</sup>*) value for each symbol. As a result, only half of the parity check equations are independent and the dependent equations may be deleted. To keep the same number of independent parity check equations as before, the degree of the Goppa polynomial is doubled. The Goppa polynomial is now given by

$$g(z) = z^{2t} \tag{6.33}$$

The parity check matrix for the binary Goppa BCH code is

$$\mathbf{H}\_{(2^m-1,\,2^m-1-mt,\,2t+1)} = \begin{bmatrix}
1 & \alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \alpha^{-4} & \dots & \alpha^{-(2^m-2)} \\
1 & \alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \alpha^{-12} & \dots & \alpha^{-3(2^m-2)} \\
1 & \alpha^{-5} & \alpha^{-10} & \alpha^{-15} & \alpha^{-20} & \dots & \alpha^{-5(2^m-2)} \\
\vdots & \vdots & \vdots & \vdots & \vdots & & \vdots \\
1 & \alpha^{-(2t-1)} & \alpha^{-2(2t-1)} & \alpha^{-3(2t-1)} & \alpha^{-4(2t-1)} & \dots & \alpha^{-(2t-1)(2^m-2)}
\end{bmatrix}$$

For binary codes, any parity check equation may be squared and the resulting parity check equation will still be satisfied. As a consequence, only one parity check equation is needed for each representative from each respective cyclotomic coset. This is clearer with an example.

The cyclotomic cosets of 31, expressed as negative integers for convenience, are as follows

$$\begin{array}{l} C\_0 = & \{0\} \\ C\_{-1} = & \{-1, -2, -4, -8, -16\} \\ C\_{-3} = & \{-3, -6, -12, -24, -17\} \\ C\_{-5} = & \{-5, -10, -20, -9, -18\} \\ C\_{-7} = & \{-7, -14, -28, -25, -19\} \\ C\_{-11} = & \{-11, -22, -13, -26, -21\} \\ C\_{-15} = & \{-15, -30, -29, -27, -23\} \end{array}$$
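These cosets can be regenerated mechanically. A minimal sketch in Python (the helper name `cyclotomic_cosets` is ours), printing each coset with negated elements to match the convention above:

```python
# Compute the cyclotomic cosets of 2 modulo n, printed with negative
# representatives to match the convention used in the text.
def cyclotomic_cosets(n):
    cosets, seen = [], set()
    for s in range(n):
        if s in seen:
            continue
        coset, x = [], s
        while x not in coset:
            coset.append(x)
            seen.add(x)
            x = (2 * x) % n
        cosets.append(coset)
    return cosets

for c in cyclotomic_cosets(31):
    print([-e for e in c])      # first two lines: [0], [-1, -2, -4, -8, -16]
```

Running this reproduces the seven cosets listed above, in the same order.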

To construct the *GF*(32) nonbinary (31, 27) BCH code, the Goppa polynomial is *g*(*z*) = *z*<sup>4</sup> and there are 4 parity check equations with parity check matrix:

$$\mathbf{H}\_{(31,27,5)} = \begin{bmatrix}
1 & \alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \alpha^{-4} & \dots & \alpha^{-30} \\
1 & \alpha^{-2} & \alpha^{-4} & \alpha^{-6} & \alpha^{-8} & \dots & \alpha^{-29} \\
1 & \alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \alpha^{-12} & \dots & \alpha^{-28} \\
1 & \alpha^{-4} & \alpha^{-8} & \alpha^{-12} & \alpha^{-16} & \dots & \alpha^{-27}
\end{bmatrix} \tag{6.34}$$

As a binary code with binary codeword coefficients, the parity check matrix has only two independent rows. To construct the binary parity check matrix, each *GF*(32) symbol is replaced with its 5-bit column vector so that each parity symbol will require 5 rows of the binary parity check matrix. The code becomes a (31, 21, 5) binary code. The parity check matrix for the binary code after removing the dependent rows is given by

$$\mathbf{H}\_{(31,21,5)} = \begin{bmatrix}
1 & \alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \alpha^{-4} & \dots & \alpha^{-30} \\
1 & \alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \alpha^{-12} & \dots & \alpha^{-28}
\end{bmatrix} \tag{6.35}$$

To maintain 4 independent parity check equations for the binary code, the Goppa polynomial is doubled in degree to become *g*(*z*) = *z*<sup>8</sup>. Replacing each *GF*(32) symbol with its 5-bit column vector will produce a (31, 11) binary code. The parity check matrix for the binary code is given by:

$$\mathbf{H}\_{(31,11,9)} = \begin{bmatrix}
1 & \alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \alpha^{-4} & \dots & \alpha^{-30} \\
1 & \alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \alpha^{-12} & \dots & \alpha^{-28} \\
1 & \alpha^{-5} & \alpha^{-10} & \alpha^{-15} & \alpha^{-20} & \dots & \alpha^{-26} \\
1 & \alpha^{-7} & \alpha^{-14} & \alpha^{-21} & \alpha^{-28} & \dots & \alpha^{-24}
\end{bmatrix} \tag{6.36}$$

Looking at the cyclotomic cosets for 31, it will be noticed that α<sup>−9</sup> is in the same coset as α<sup>−5</sup>, and so for codewords with binary coefficients we may use the Goppa polynomial *g*(*z*) = *z*<sup>10</sup> with the corresponding parity check matrix

$$\mathbf{H}\_{(31,11,11)} = \begin{bmatrix}
1 & \alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \alpha^{-4} & \alpha^{-5} & \alpha^{-6} & \dots & \alpha^{-30} \\
1 & \alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \alpha^{-12} & \alpha^{-15} & \alpha^{-18} & \dots & \alpha^{-28} \\
1 & \alpha^{-7} & \alpha^{-14} & \alpha^{-21} & \alpha^{-28} & \alpha^{-4} & \alpha^{-11} & \dots & \alpha^{-24} \\
1 & \alpha^{-9} & \alpha^{-18} & \alpha^{-27} & \alpha^{-5} & \alpha^{-14} & \alpha^{-23} & \dots & \alpha^{-22}
\end{bmatrix} \tag{6.37}$$

Alternatively, we may use the Goppa polynomial *g*(*z*) = *z*<sup>8</sup> with the parity check matrix given by (6.36). The result is the same code. From this analysis we can see why the *dmin* of this BCH code is greater by 2 than the BCH code bound: the degree of the Goppa polynomial is 10.

To find other exceptional BCH codes, we need to look at the cyclotomic cosets for similar cases where a row of the parity check matrix is equivalent to a higher degree row. Consider the construction of the (31, 6, 2*t* + 1) BCH code, which will have 5 parity check equations. From the cyclotomic cosets for 31, it will be noticed that α<sup>−13</sup> is in the same coset as α<sup>−11</sup>, and so we may use the Goppa polynomial *g*(*z*) = *z*<sup>14</sup> and obtain a (31, 6, 15) binary BCH code. The BCH bound indicates a minimum Hamming distance of only 11. Another example is evident from the cyclotomic cosets of 127, where α<sup>−17</sup> is in the same coset as α<sup>−9</sup>. Setting the Goppa polynomial *g*(*z*) = *z*<sup>30</sup> produces the (127, 71, 19) BCH code, whilst the BCH bound indicates a minimum Hamming distance of 17.
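Both coset-membership claims are quick to confirm by generating the relevant coset; a sketch (the helper name `coset_of` is ours):

```python
# Confirm the facts used above: -13 lies in the cyclotomic coset of
# -11 modulo 31, and -17 lies in the coset of -9 modulo 127.
def coset_of(s, n):
    out, x = set(), s % n
    while x not in out:
        out.add(x)
        x = (2 * x) % n
    return out

assert -13 % 31 in coset_of(-11, 31)
assert -17 % 127 in coset_of(-9, 127)
```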

To see the details of the construction of the parity check matrix for the binary BCH code, we will consider the (31, 11, 11) code with parity check matrix given by matrix (6.37). Each *GF*(32) symbol is replaced with the binary representation given by Table 6.2, as a 5-bit column vector, where α is a primitive root of the polynomial 1 + *x*<sup>2</sup> + *x*<sup>5</sup>.
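Table 6.2 itself is not reproduced in this section, but the 5-bit representations follow directly from the primitive polynomial. A sketch of the antilog-table generation (the helper name is ours; bit *i* of each entry holds the coefficient of α<sup>*i*</sup>):

```python
# Generate the GF(32) antilog table for alpha a root of 1 + x^2 + x^5.
# Bit i of each entry is the coefficient of alpha^i, so e.g. alpha^5
# is stored as 0b00101, i.e. alpha^2 + 1.
def gf32_antilog():
    table, v = [], 1
    for _ in range(31):
        table.append(v)
        v <<= 1                     # multiply by alpha
        if v & 0b100000:            # reduce: alpha^5 = alpha^2 + 1
            v ^= 0b100101
    return table

alog = gf32_antilog()
print(bin(alog[5]))   # prints 0b101
```

All 31 nonzero field elements appear exactly once, as they must for a primitive polynomial.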

The binary parity check matrix that is obtained is given by matrix (6.38).

$$\mathbf{H}\_{(31,11,11)} = \begin{bmatrix}
1&0&1&0&1&1&1&\dots&0 \\
0&1&0&1&1&1&0&\dots&1 \\
0&0&0&1&0&1&0&\dots&0 \\
0&0&1&0&1&0&1&\dots&0 \\
0&1&0&1&0&1&1&\dots&0 \\
1&0&1&1&0&1&0&\dots&0 \\
0&1&0&0&1&1&0&\dots&0 \\
0&1&0&1&1&0&1&\dots&0 \\
0&0&1&0&0&1&1&\dots&1 \\
0&1&1&1&0&1&1&\dots&0 \\
1&1&0&1&1&0&0&\dots&1 \\
0&1&0&1&1&1&1&\dots&0 \\
0&1&0&0&1&0&0&\dots&1 \\
0&0&1&1&0&1&0&\dots&0 \\
0&1&1&1&0&0&0&\dots&0 \\
1&1&0&0&1&1&1&\dots&0 \\
0&0&0&0&1&1&0&\dots&1 \\
0&1&1&0&1&0&1&\dots&0 \\
0&0&1&0&0&0&1&\dots&1 \\
0&1&1&1&1&1&0&\dots&1
\end{bmatrix} \tag{6.38}$$

Evaluating the minimum Hamming distance of this code confirms that it is 11, an increase of 2 over the BCH bound for the minimum Hamming distance.
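This claim can be checked by brute force: build the four symbol rows of (6.37) over *GF*(32), expand them to a 20 × 31 binary matrix, and enumerate all 2<sup>11</sup> codewords of the null space. A self-contained sketch (helper names are ours; α is taken as a root of 1 + *x*<sup>2</sup> + *x*<sup>5</sup> as in Table 6.2):

```python
# Brute-force verification that the binary code with parity checks
# sum_i c_i * alpha^(-e*i) = 0 for e in {1, 3, 7, 9} (matrix (6.37))
# is a (31, 11) code with minimum Hamming distance 11.
alog, v = [], 1
for _ in range(31):
    alog.append(v)
    v <<= 1
    if v & 0b100000:                # reduce by alpha^5 = alpha^2 + 1
        v ^= 0b100101

# Five binary parity check rows (one per bit) for each symbol row;
# each row is stored as a 31-bit integer.
rows = []
for e in (1, 3, 7, 9):
    symbols = [alog[(-e * i) % 31] for i in range(31)]
    for bit in range(5):
        rows.append(sum(((s >> bit) & 1) << i for i, s in enumerate(symbols)))

# Gaussian elimination over GF(2), keeping pivot rows fully reduced.
pivots = {}                         # pivot column -> reduced row
for r in rows:
    for c, pr in pivots.items():
        if (r >> c) & 1:
            r ^= pr
    if r:
        c = r.bit_length() - 1
        for c2 in pivots:
            if (pivots[c2] >> c) & 1:
                pivots[c2] ^= r
        pivots[c] = r

# Null-space basis: one codeword per non-pivot (free) column.
basis = []
for f in (c for c in range(31) if c not in pivots):
    vec = 1 << f
    for c, pr in pivots.items():
        if (pr >> f) & 1:
            vec |= 1 << c
    basis.append(vec)

dmin = 31
for mask in range(1, 1 << len(basis)):   # all nonzero codewords
    cw = 0
    for i, b in enumerate(basis):
        if (mask >> i) & 1:
            cw ^= b
    dmin = min(dmin, bin(cw).count("1"))

print(len(basis), dmin)
```

The enumeration of 2048 codewords runs in well under a second and reports dimension 11 and minimum distance 11.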

#### **6.7 Extended BCH Codes as Goppa Codes**

In a short paper in 1971 [4], Goppa showed how a binary Goppa code could be constructed with parameters (2*<sup>m</sup>* + (*m* − 1)*t*, 2*<sup>m</sup>* − *t*, 2*t* + 1). Each parity check symbol, *m* bits long, has a Forney concatenation [2], i.e. an overall parity bit on each symbol. In a completely novel approach by Goppa, each parity symbol, apart from one bit in each symbol, is placed external to the code, as if these were additional parity symbols. These symbols are also independent of each other, extending the length of the code and, importantly, increasing the *dmin* of the code. Sugiyama et al. [9, 10] described a construction technique mixing the standard Goppa code construction with Goppa's external parity check construction. We give below a simpler construction method applicable to BCH codes and to more general Goppa codes.

Consider a binary BCH code constructed as a Goppa code with Goppa polynomial *g*(*z*) = *z*<sup>2*t*</sup> but extended by including an additional root α<sub>0</sub>, an element of *GF*(2*<sup>m</sup>*). The Goppa polynomial is now *g*(*z*) = *z*<sup>2*t*+1</sup> + α<sub>0</sub>*z*<sup>2*t*</sup>. The parity check equations are given by

$$\sum\_{i=0}^{2^m - 2} \frac{c\_i}{z - \alpha^i} = 0 \quad \text{modulo } g(z), \quad \alpha^i \neq \alpha\_0 \tag{6.39}$$

Substituting for *rm* and *q*(*z*), as in Sect. 6.5

$$\frac{1}{z - \alpha^i} \mod \mathbf{g}(z) = \frac{\mathbf{g}(z) + \mathbf{g}(\alpha^i)}{\mathbf{g}(\alpha^i)(z - \alpha^i)}\tag{6.40}$$

For the extended binary BCH code with Goppa polynomial *g*(*z*) = *z*<sup>2*t*+1</sup> + α<sub>0</sub>*z*<sup>2*t*</sup>, the parity check equations are given by

$$\sum\_{i=1}^{2^m-2} \frac{c\_i}{z-\alpha^i} = \sum\_{i=1}^{2^m-2} c\_i\left(\frac{z^{2t}}{\alpha^{2ti}(\alpha^i+\alpha\_0)} + \frac{z^{2t-1}}{\alpha^{2ti}} + \frac{z^{2t-2}}{\alpha^{(2t-1)i}} + \frac{z^{2t-3}}{\alpha^{(2t-2)i}} + \dots + \frac{1}{\alpha^i}\right) = 0 \tag{6.41}$$

Equating each coefficient of powers of *z* to zero and using only the independent parity check equations (as it is a binary code) produces *t* + 1 independent parity check equations with parity check matrix

$$\mathbf{H}\_{(2^m-2,\,2^m-2-mt-m)} = \begin{bmatrix}
\alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \dots & \alpha^{-(2^m-2)} \\
\alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \dots & \alpha^{-3(2^m-2)} \\
\alpha^{-5} & \alpha^{-10} & \alpha^{-15} & \dots & \alpha^{-5(2^m-2)} \\
\vdots & \vdots & \vdots & & \vdots \\
\alpha^{-(2t-1)} & \alpha^{-2(2t-1)} & \alpha^{-3(2t-1)} & \dots & \alpha^{-(2t-1)(2^m-2)} \\
\frac{\alpha^{-2t}}{\alpha+\alpha\_0} & \frac{\alpha^{-4t}}{\alpha^2+\alpha\_0} & \frac{\alpha^{-6t}}{\alpha^3+\alpha\_0} & \dots & \frac{\alpha^{-2t(2^m-2)}}{\alpha^{2^m-2}+\alpha\_0}
\end{bmatrix} \tag{6.42}$$

The last row may be simplified by noting that

$$\frac{1 + \alpha\_0^{-2t}\alpha^{2t}}{(\alpha\_0 + \alpha)\alpha^{2t}} = \frac{\alpha\_0^{-1}}{\alpha^{2t}} + \frac{\alpha\_0^{-2}}{\alpha^{2t - 1}} + \frac{\alpha\_0^{-3}}{\alpha^{2t - 2}} + \dots + \frac{\alpha\_0^{-2t}}{\alpha} \tag{6.43}$$

Rearranging produces

$$\frac{1}{(\alpha\_0+\alpha)\alpha^{2t}} = \frac{\alpha\_0^{-2t}\alpha^{2t}}{(\alpha\_0+\alpha)\alpha^{2t}} + \frac{\alpha\_0^{-1}}{\alpha^{2t}} + \frac{\alpha\_0^{-2}}{\alpha^{2t-1}} + \frac{\alpha\_0^{-3}}{\alpha^{2t-2}} + \dots + \frac{\alpha\_0^{-2t}}{\alpha} \quad (6.44)$$

and

$$\frac{\alpha^{-2t}}{(\alpha\_0 + \alpha)} = \frac{\alpha\_0^{-2t}}{(\alpha\_0 + \alpha)} + \frac{\alpha\_0^{-1}}{\alpha^{2t}} + \frac{\alpha\_0^{-2}}{\alpha^{2t-1}} + \frac{\alpha\_0^{-3}}{\alpha^{2t-2}} + \dots + \frac{\alpha\_0^{-2t}}{\alpha} \tag{6.45}$$
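The key fact, that α<sup>−2*t*</sup>/(α + α<sub>0</sub>) differs from the Cauchy term α<sub>0</sub><sup>−2*t*</sup>/(α + α<sub>0</sub>) only by a combination of the power terms α<sup>−1</sup>, ..., α<sup>−2*t*</sup>, can be checked numerically. A sketch (helper names are ours) verifies it in *GF*(32) for 2*t* = 10 and all distinct pairs α, α<sub>0</sub>; addition in a field of characteristic 2 is XOR:

```python
# Exhaustive GF(32) check of the partial-fraction expansion used to
# replace the last row of (6.42), with 2t = 10:
#   a^-2t/(a0 + a) = a0^-2t/(a0 + a) + sum_{j=1..2t} a0^-j * a^-(2t+1-j)
# alpha is a root of 1 + x^2 + x^5.
alog, v = [], 1
for _ in range(31):
    alog.append(v)
    v <<= 1
    if v & 0b100000:
        v ^= 0b100101
log = {s: i for i, s in enumerate(alog)}

def mul(a, b):
    return 0 if 0 in (a, b) else alog[(log[a] + log[b]) % 31]

def inv(a):
    return alog[(-log[a]) % 31]

def alpha_pow(i, n):               # (alpha^i)^n, n may be negative
    return alog[(i * n) % 31]

t2 = 10                            # 2t, matching g(z) = z^10
pairs_checked = 0
for i in range(31):
    for k in range(31):
        if i == k:
            continue
        a, a0 = alog[i], alog[k]
        lhs = mul(alpha_pow(i, -t2), inv(a ^ a0))
        rhs = mul(alpha_pow(k, -t2), inv(a ^ a0))
        for j in range(1, t2 + 1):
            rhs ^= mul(alpha_pow(k, -j), alpha_pow(i, -(t2 + 1 - j)))
        assert lhs == rhs
        pairs_checked += 1
```

Every one of the 930 ordered pairs satisfies the equality, confirming that the last row of (6.42) lies in the row space used below.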

The point here is that, because of the above equality, the last parity check equation in (6.42) may be replaced with a simpler equation to produce the same Cauchy style parity check given by Goppa in his 1971 paper [4]. The parity check matrix becomes

$$\mathbf{H}\_{(2^m-2,\,2^m-2-mt-m)} = \begin{bmatrix}
\alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \dots & \alpha^{-(2^m-2)} \\
\alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \dots & \alpha^{-3(2^m-2)} \\
\alpha^{-5} & \alpha^{-10} & \alpha^{-15} & \dots & \alpha^{-5(2^m-2)} \\
\vdots & \vdots & \vdots & & \vdots \\
\alpha^{-(2t-1)} & \alpha^{-2(2t-1)} & \alpha^{-3(2t-1)} & \dots & \alpha^{-(2t-1)(2^m-2)} \\
\frac{1}{\alpha+\alpha\_0} & \frac{1}{\alpha^2+\alpha\_0} & \frac{1}{\alpha^3+\alpha\_0} & \dots & \frac{1}{\alpha^{2^m-2}+\alpha\_0}
\end{bmatrix} \tag{6.46}$$

The justification for this is that, from (6.45), the last row of (6.42) is equal to a scalar weighted linear combination of the rows of the parity check matrix (6.46), so that these rows produce the same code as the parity check matrix (6.42). By induction, other roots from *GF*(2*<sup>m</sup>*) may be used to produce similar parity check equations, increasing the distance of the code and producing parity check matrices of the form:

$$\mathbf{H} = \begin{bmatrix}
\alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \alpha^{-4} & \dots & \alpha^{-(2^m-2)} \\
\alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \alpha^{-12} & \dots & \alpha^{-3(2^m-2)} \\
\alpha^{-5} & \alpha^{-10} & \alpha^{-15} & \alpha^{-20} & \dots & \alpha^{-5(2^m-2)} \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
\alpha^{-(2t-1)} & \alpha^{-2(2t-1)} & \alpha^{-3(2t-1)} & \alpha^{-4(2t-1)} & \dots & \alpha^{-(2t-1)(2^m-2)} \\
\frac{1}{\alpha+\alpha\_0} & \frac{1}{\alpha^2+\alpha\_0} & \frac{1}{\alpha^3+\alpha\_0} & \frac{1}{\alpha^4+\alpha\_0} & \dots & \frac{1}{\alpha^{2^m-2}+\alpha\_0} \\
\frac{1}{\alpha+\alpha\_1} & \frac{1}{\alpha^2+\alpha\_1} & \frac{1}{\alpha^3+\alpha\_1} & \frac{1}{\alpha^4+\alpha\_1} & \dots & \frac{1}{\alpha^{2^m-2}+\alpha\_1} \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
\frac{1}{\alpha+\alpha\_{s\_0-1}} & \frac{1}{\alpha^2+\alpha\_{s\_0-1}} & \frac{1}{\alpha^3+\alpha\_{s\_0-1}} & \frac{1}{\alpha^4+\alpha\_{s\_0-1}} & \dots & \frac{1}{\alpha^{2^m-2}+\alpha\_{s\_0-1}}
\end{bmatrix} \tag{6.47}$$

The parity symbols given by the last *s*<sup>0</sup> rows of this matrix are in the Cauchy matrix style [7] and will necessarily reduce the length of the code for each root of the Goppa polynomial which is an element of *GF*(2*<sup>m</sup>*). However, Goppa was the first to show [4] that the parity symbols may be optionally placed external to the code, without decreasing the length of the code. For binary codes the length of the code increases as will be shown below. Accordingly, with external parity symbols, the parity check matrix becomes

$$\mathbf{H} = \begin{bmatrix}
\alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \alpha^{-4} & \dots & \alpha^{-(2^m-2)} & 0 & 0 & \dots & 0 \\
\alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \alpha^{-12} & \dots & \alpha^{-3(2^m-2)} & 0 & 0 & \dots & 0 \\
\alpha^{-5} & \alpha^{-10} & \alpha^{-15} & \alpha^{-20} & \dots & \alpha^{-5(2^m-2)} & 0 & 0 & \dots & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
\alpha^{-(2t-1)} & \alpha^{-2(2t-1)} & \alpha^{-3(2t-1)} & \alpha^{-4(2t-1)} & \dots & \alpha^{-(2t-1)(2^m-2)} & 0 & 0 & \dots & 0 \\
\frac{1}{\alpha+\alpha\_0} & \frac{1}{\alpha^2+\alpha\_0} & \frac{1}{\alpha^3+\alpha\_0} & \frac{1}{\alpha^4+\alpha\_0} & \dots & \frac{1}{\alpha^{2^m-2}+\alpha\_0} & 1 & 0 & \dots & 0 \\
\frac{1}{\alpha+\alpha\_1} & \frac{1}{\alpha^2+\alpha\_1} & \frac{1}{\alpha^3+\alpha\_1} & \frac{1}{\alpha^4+\alpha\_1} & \dots & \frac{1}{\alpha^{2^m-2}+\alpha\_1} & 0 & 1 & \dots & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
\frac{1}{\alpha+\alpha\_{s\_0-1}} & \frac{1}{\alpha^2+\alpha\_{s\_0-1}} & \frac{1}{\alpha^3+\alpha\_{s\_0-1}} & \frac{1}{\alpha^4+\alpha\_{s\_0-1}} & \dots & \frac{1}{\alpha^{2^m-2}+\alpha\_{s\_0-1}} & 0 & 0 & \dots & 1
\end{bmatrix} \tag{6.48}$$

As an example of the procedure, consider the (31, 11, 11) binary BCH code described in Sect. 6.6. We shall add one external parity symbol to this code according to the parity check matrix in (6.48) and eventually produce a (36, 10, 13) binary BCH code. Arbitrarily, we shall choose α<sub>0</sub> = 1. This means that the first column of the parity check matrix for the (31, 11, 11) code given in (6.38) is deleted and there is one additional parity check row. The parity check matrix for this (35, 10, 12) extended BCH code is given below. Note that an additional parity bit will later be added in a Forney concatenation of the external parity symbol, producing the (36, 10, 13) code as a last step.

$$\mathbf{H}\_{(35,10,12)} = \begin{bmatrix}
\alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \alpha^{-4} & \alpha^{-5} & \alpha^{-6} & \dots & \alpha^{-30} & 0 \\
\alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \alpha^{-12} & \alpha^{-15} & \alpha^{-18} & \dots & \alpha^{-28} & 0 \\
\alpha^{-7} & \alpha^{-14} & \alpha^{-21} & \alpha^{-28} & \alpha^{-4} & \alpha^{-11} & \dots & \alpha^{-24} & 0 \\
\alpha^{-9} & \alpha^{-18} & \alpha^{-27} & \alpha^{-5} & \alpha^{-14} & \alpha^{-23} & \dots & \alpha^{-22} & 0 \\
\frac{1}{\alpha+1} & \frac{1}{\alpha^2+1} & \frac{1}{\alpha^3+1} & \frac{1}{\alpha^4+1} & \frac{1}{\alpha^5+1} & \frac{1}{\alpha^6+1} & \dots & \frac{1}{\alpha^{30}+1} & 1
\end{bmatrix} \tag{6.49}$$

Evaluating the last row by carrying out the additions and inversions, referring to the table of *GF*(32) symbols in Table 6.2, produces the resulting parity check matrix

$$\mathbf{H}\_{(35,10,12)} = \begin{bmatrix}
\alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \alpha^{-4} & \alpha^{-5} & \alpha^{-6} & \dots & \alpha^{-30} & 0 \\
\alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \alpha^{-12} & \alpha^{-15} & \alpha^{-18} & \dots & \alpha^{-28} & 0 \\
\alpha^{-7} & \alpha^{-14} & \alpha^{-21} & \alpha^{-28} & \alpha^{-4} & \alpha^{-11} & \dots & \alpha^{-24} & 0 \\
\alpha^{-9} & \alpha^{-18} & \alpha^{-27} & \alpha^{-5} & \alpha^{-14} & \alpha^{-23} & \dots & \alpha^{-22} & 0 \\
\alpha^{-18} & \alpha^{-5} & \alpha^{-29} & \alpha^{-10} & \alpha^{-2} & \alpha^{-27} & \dots & \alpha^{-17} & 1
\end{bmatrix} \tag{6.50}$$

The next step is to determine the binary parity check matrix for the code by replacing each *GF*(32) symbol by its corresponding 5-bit representation using Table 6.2, but as a 5-bit column vector. Also, we will add an additional parity check row to implement the Forney concatenation of the external parity symbol. The resulting binary parity check matrix in (6.51) is obtained. Evaluating the minimum Hamming distance of this code using one of the methods described in Chap. 5 verifies that it is indeed 13.

Adding the external parity symbol has increased the minimum Hamming distance by 2, but at the cost of one data symbol. Instead of choosing α<sub>0</sub> = 1, a good idea is to choose α<sub>0</sub> = 0, since 0 is a multiple root of the Goppa polynomial *g*(*z*) = *z*<sup>10</sup>, which caused the BCH code to be shortened from length 2*<sup>m</sup>* to 2*<sup>m</sup>* − 1 in the first place. (The length of a Goppa code whose Goppa polynomial *g*(*z*) has no roots in *GF*(2*<sup>m</sup>*) is 2*<sup>m</sup>*.) The resulting parity check matrix is given in (6.52).



$$\mathbf{H}\_{(36,11)} = \begin{bmatrix}
1 & \alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \alpha^{-4} & \alpha^{-5} & \alpha^{-6} & \dots & \alpha^{-30} & 0 \\
1 & \alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \alpha^{-12} & \alpha^{-15} & \alpha^{-18} & \dots & \alpha^{-28} & 0 \\
1 & \alpha^{-7} & \alpha^{-14} & \alpha^{-21} & \alpha^{-28} & \alpha^{-4} & \alpha^{-11} & \dots & \alpha^{-24} & 0 \\
1 & \alpha^{-9} & \alpha^{-18} & \alpha^{-27} & \alpha^{-5} & \alpha^{-14} & \alpha^{-23} & \dots & \alpha^{-22} & 0 \\
1 & \alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \alpha^{-4} & \alpha^{-5} & \alpha^{-6} & \dots & \alpha^{-30} & 1
\end{bmatrix} \tag{6.52}$$

The problem with this is that the minimum Hamming distance is still 11, because the last row of the parity check matrix is the same as the first row apart from the external parity symbol, 0 being a root of the Goppa polynomial. The solution is to increase the degree of the Goppa polynomial but still retain the external parity symbol. Referring to the cyclotomic cosets of 31, see (6.35), we should use *g*(*z*) = *z*<sup>12</sup> to produce the parity check matrix

$$\mathbf{H}\_{(36,11)} = \begin{bmatrix}
1 & \alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \alpha^{-4} & \alpha^{-5} & \alpha^{-6} & \dots & \alpha^{-30} & 0 \\
1 & \alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \alpha^{-12} & \alpha^{-15} & \alpha^{-18} & \dots & \alpha^{-28} & 0 \\
1 & \alpha^{-7} & \alpha^{-14} & \alpha^{-21} & \alpha^{-28} & \alpha^{-4} & \alpha^{-11} & \dots & \alpha^{-24} & 0 \\
1 & \alpha^{-9} & \alpha^{-18} & \alpha^{-27} & \alpha^{-5} & \alpha^{-14} & \alpha^{-23} & \dots & \alpha^{-22} & 0 \\
1 & \alpha^{-11} & \alpha^{-22} & \alpha^{-2} & \alpha^{-13} & \alpha^{-24} & \alpha^{-4} & \dots & \alpha^{-20} & 1
\end{bmatrix} \tag{6.53}$$

As before, the next step is to determine the binary parity check matrix for the code from this matrix by replacing each *GF*(32) symbol by its corresponding 5 bit representation using Table 6.2 as a 5 bit column vector. Also we will add an additional parity check row to implement the Forney concatenation of the external parity symbol. The resulting binary parity check matrix is obtained

$$\mathbf{H}\_{(37,11,13)} = \begin{bmatrix}
1&0&1&0&1&1&1&\dots&0&0&0&0&0&0&0 \\
0&1&0&1&1&1&0&\dots&1&0&0&0&0&0&0 \\
0&0&0&1&0&1&0&\dots&0&0&0&0&0&0&0 \\
0&0&1&0&1&0&1&\dots&0&0&0&0&0&0&0 \\
0&1&0&1&0&1&1&\dots&0&0&0&0&0&0&0 \\
1&0&1&1&0&1&0&\dots&0&0&0&0&0&0&0 \\
0&1&0&0&1&1&0&\dots&0&0&0&0&0&0&0 \\
0&1&0&1&1&0&1&\dots&0&0&0&0&0&0&0 \\
0&0&1&0&0&1&1&\dots&1&0&0&0&0&0&0 \\
0&1&1&1&0&1&1&\dots&0&0&0&0&0&0&0 \\
1&1&0&1&1&0&0&\dots&1&0&0&0&0&0&0 \\
0&1&0&1&1&1&1&\dots&0&0&0&0&0&0&0 \\
0&1&0&0&1&0&0&\dots&1&0&0&0&0&0&0 \\
0&0&1&1&0&1&0&\dots&0&0&0&0&0&0&0 \\
0&1&1&1&0&0&0&\dots&0&0&0&0&0&0&0 \\
1&1&0&0&1&1&1&\dots&0&0&0&0&0&0&0 \\
0&0&0&0&1&1&0&\dots&1&0&0&0&0&0&0 \\
0&1&1&0&1&0&1&\dots&0&0&0&0&0&0&0 \\
0&0&1&0&0&0&1&\dots&1&0&0&0&0&0&0 \\
0&1&1&1&1&1&0&\dots&1&0&0&0&0&0&0 \\
1&0&0&1&1&0&1&\dots&1&1&0&0&0&0&0 \\
0&0&1&0&1&0&1&\dots&1&0&1&0&0&0&0 \\
0&1&0&0&0&1&0&\dots&1&0&0&1&0&0&0 \\
0&1&1&1&0&0&1&\dots&0&0&0&0&1&0&0 \\
0&0&1&0&0&1&0&\dots&0&0&0&0&0&1&0 \\
0&0&0&0&0&0&0&\dots&0&1&1&1&1&1&1
\end{bmatrix} \tag{6.54}$$

Weight spectrum analysis of this code confirms that the *dmin* is indeed 13. One or more Cauchy style parity check equations may be added to this code to increase the *dmin* of the code. For example, with one more parity check equation, again with the choice of α<sub>0</sub> = 1, the parity check matrix for the (42, 10) code is

$$\mathbf{H}\_{(42,10)} = \begin{bmatrix}
\alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \alpha^{-4} & \alpha^{-5} & \alpha^{-6} & \dots & \alpha^{-30} & 0 & 0 \\
\alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \alpha^{-12} & \alpha^{-15} & \alpha^{-18} & \dots & \alpha^{-28} & 0 & 0 \\
\alpha^{-7} & \alpha^{-14} & \alpha^{-21} & \alpha^{-28} & \alpha^{-4} & \alpha^{-11} & \dots & \alpha^{-24} & 0 & 0 \\
\alpha^{-9} & \alpha^{-18} & \alpha^{-27} & \alpha^{-5} & \alpha^{-14} & \alpha^{-23} & \dots & \alpha^{-22} & 0 & 0 \\
\alpha^{-11} & \alpha^{-22} & \alpha^{-2} & \alpha^{-13} & \alpha^{-24} & \alpha^{-4} & \dots & \alpha^{-20} & 1 & 0 \\
\alpha^{-18} & \alpha^{-5} & \alpha^{-29} & \alpha^{-10} & \alpha^{-2} & \alpha^{-27} & \dots & \alpha^{-17} & 0 & 1
\end{bmatrix} \tag{6.55}$$

Replacing each *GF*(32) symbol by its corresponding 5 bit representation using Table 6.2 as a 5-bit column vector and adding an additional parity check row to each external parity symbol produces the binary parity check matrix for the (42, 10, 15) code.

$$\mathbf{H}\_{(42,10,15)} = \begin{bmatrix}
0&1&0&1&1&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&0&1&1&1&0&\dots&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&1&0&1&0&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&0&1&0&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&0&1&0&1&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&1&0&1&0&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&0&0&1&1&0&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&0&1&1&0&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&0&0&1&1&\dots&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&1&1&0&1&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&0&1&1&0&0&\dots&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&0&1&1&1&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&0&0&1&0&0&\dots&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&1&0&1&0&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&1&1&0&0&0&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&0&0&1&1&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&0&1&1&0&\dots&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&1&0&1&0&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&0&0&0&1&\dots&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&1&1&1&1&0&\dots&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&1&1&0&1&\dots&1&1&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&0&1&0&1&\dots&1&0&1&0&0&0&0&0&0&0&0&0&0 \\
1&0&0&0&1&0&\dots&1&0&0&1&0&0&0&0&0&0&0&0&0 \\
1&1&1&0&0&1&\dots&0&0&0&0&1&0&0&0&0&0&0&0&0 \\
0&1&0&0&1&0&\dots&0&0&0&0&0&1&0&0&0&0&0&0&0 \\
0&0&0&0&0&0&\dots&0&1&1&1&1&1&1&0&0&0&0&0&0 \\
0&1&0&0&1&0&\dots&1&0&0&0&0&0&0&1&0&0&0&0&0 \\
0&1&0&0&0&0&\dots&0&0&0&0&0&0&0&0&1&0&0&0&0 \\
1&1&1&0&0&0&\dots&1&0&0&0&0&0&0&0&0&1&0&0&0 \\
1&0&0&1&1&0&\dots&1&0&0&0&0&0&0&0&0&0&1&0&0 \\
1&1&0&1&0&1&\dots&1&0&0&0&0&0&0&0&0&0&0&1&0 \\
0&0&0&0&0&0&\dots&0&0&0&0&0&0&0&1&1&1&1&1&1
\end{bmatrix} \tag{6.56}$$

Weight spectrum analysis of this code confirms that the *dmin* is indeed 15. In this construction the information bit coordinate corresponding to α<sup>0</sup> = 1 is deleted, reducing the dimension of the code by 1. This is conventional practice when the Goppa polynomial *g*(*z*) contains a root that is in *GF*(2*<sup>m</sup>*). However, on reflection, this is not essential. Certainly, in the parity check symbol equations of the constructed code, there will be one parity check equation where the coordinate is missing, but additional parity check equations may be used to compensate for the missing coordinate(s).

Consider the (42, 10) code above, given by parity check matrix (6.55) without the deletion of the first coordinate. The parity check matrix for the (42, 11) code becomes

$$\mathbf{H}\_{(42,11)} = \begin{bmatrix}
1 & \alpha^{-1} & \alpha^{-2} & \alpha^{-3} & \alpha^{-4} & \alpha^{-5} & \alpha^{-6} & \dots & \alpha^{-30} & 0 & 0 \\
1 & \alpha^{-3} & \alpha^{-6} & \alpha^{-9} & \alpha^{-12} & \alpha^{-15} & \alpha^{-18} & \dots & \alpha^{-28} & 0 & 0 \\
1 & \alpha^{-7} & \alpha^{-14} & \alpha^{-21} & \alpha^{-28} & \alpha^{-4} & \alpha^{-11} & \dots & \alpha^{-24} & 0 & 0 \\
1 & \alpha^{-9} & \alpha^{-18} & \alpha^{-27} & \alpha^{-5} & \alpha^{-14} & \alpha^{-23} & \dots & \alpha^{-22} & 0 & 0 \\
1 & \alpha^{-11} & \alpha^{-22} & \alpha^{-2} & \alpha^{-13} & \alpha^{-24} & \alpha^{-4} & \dots & \alpha^{-20} & 1 & 0 \\
0 & \alpha^{-18} & \alpha^{-5} & \alpha^{-29} & \alpha^{-10} & \alpha^{-2} & \alpha^{-27} & \dots & \alpha^{-17} & 0 & 1
\end{bmatrix} \tag{6.57}$$

It will be noticed that the first coordinate is not in the last parity check equation. Constructing the binary code as before by replacing each *GF*(32) symbol by its corresponding 5-bit representation using Table 6.2 as a 5-bit column vector and adding an additional parity check row to each external parity symbol produces a (42, 11, 13) binary code. There is no improvement in the *dmin* of the (42, 11, 13) binary code compared to the (37, 11, 13) binary code despite the 5 additional parity bits. However, weight spectrum analysis of the (42, 11, 13) binary code shows that there is only 1 codeword of weight 13 and only 3 codewords of weight 14. All of these low weight codewords contain the first coordinate which is not surprising. Two more parity check equations containing the first coordinate need to be added to the parity check matrix to compensate for the coordinate not being in the last equation of the parity check symbol matrix (6.57).

It turns out that the coordinate in question can always be inserted into the overall parity check equation to each external parity symbol without any loss, so that only one additional parity check equation is required for each root of *g*(*z*) that is in *GF*(2*<sup>m</sup>*).

This produces the following binary parity check matrix for the (43, 11, 15) code.

$$\mathbf{H}\_{(43,11,15)} = \begin{bmatrix}
1&0&1&0&1&1&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&0&1&1&1&0&\dots&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&0&1&0&1&0&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&1&0&1&0&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&0&1&0&1&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&0&1&1&0&1&0&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&0&0&1&1&0&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&0&1&1&0&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&1&0&0&1&1&\dots&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&1&1&0&1&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&1&0&1&1&0&0&\dots&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&0&1&1&1&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&0&0&1&0&0&\dots&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&1&1&0&1&0&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&1&1&0&0&0&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&1&0&0&1&1&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&0&0&1&1&0&\dots&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&1&0&1&0&1&\dots&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&1&0&0&0&1&\dots&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&1&1&1&1&1&0&\dots&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
1&0&0&1&1&0&1&\dots&1&1&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&1&0&1&0&1&\dots&1&0&1&0&0&0&0&0&0&0&0&0&0 \\
0&1&0&0&0&1&0&\dots&1&0&0&1&0&0&0&0&0&0&0&0&0 \\
0&1&1&1&0&0&1&\dots&0&0&0&0&1&0&0&0&0&0&0&0&0 \\
0&0&1&0&0&1&0&\dots&0&0&0&0&0&1&0&0&0&0&0&0&0 \\
0&0&0&0&0&0&0&\dots&0&1&1&1&1&1&1&0&0&0&0&0&0 \\
0&0&1&0&0&1&0&\dots&1&0&0&0&0&0&0&1&0&0&0&0&0 \\
0&0&1&0&0&0&0&\dots&0&0&0&0&0&0&0&0&1&0&0&0&0
\end{bmatrix} \tag{6.58}$$


It will be noticed that the last but one row is the Forney concatenation on the last *GF*(32) symbol of parity check matrix (6.57), the overall parity check on parity bits 36–41. Bit 0 has been added to this equation. Also, the last row of the binary parity check matrix is simply a repeat of bit 0. In this way, bit 0 has been fully compensated for not being in the last row of parity check symbol matrix (6.57).

BCH codes extended in length in this way can be very competitive compared to the best known codes [5]. The most efficient extensions of BCH codes are for *g*(*z*) having only multiple roots of *z* = 0, because no additional deletions of information bits are necessary, nor are compensating parity check equations. However, *n* does need to be a Mersenne prime, and the maximum extension is 2 symbols with 2*m* + 2 additional, overall parity bits, increasing the *dmin* by 4. Where *n* is not a Mersenne prime, the maximum extension is 1 symbol with *m* + 1 additional, overall parity bits, increasing the *dmin* by 2.

However regardless of *n* being a Mersenne prime or not, multiple symbol extensions may be carried out if *g*(*z*) has additional roots from *GF*(2*<sup>m</sup>*), increasing the *dmin* by 2 for each additional root. The additional root can also be *z* = 0.

As further examples, a (37, 11, 13) code and a (43, 11, 15) code can be constructed in this way by extending the (31, 11, 11) BCH code. Also a (135, 92, 13) code and a (143, 92, 15) code can be constructed by extending the (127, 92, 11) BCH code. A (135, 71, 21) code and a (143, 71, 23) code can be constructed by extending the (127, 71, 19) BCH code.

For more than 2 extended symbols for Mersenne primes, or more than 1 extended symbol for non-Mersenne primes, it is necessary to use mixed roots of *g*(*z*) from *GF*(2*<sup>m</sup>*) and have either deletions of information bits or compensating parity check equations or both. As examples of these code constructions there are:


the dimension by 2 and one additional, compensating parity check bit. All of these codes are best known codes [5].

#### **6.8 Binary Codes from MDS Codes**

The Goppa codes and BCH codes, which are a subset of Goppa codes, when constructed as codes with symbols from *GF*(*q*), are all MDS codes and are examples of generalised Reed–Solomon codes [7]. MDS codes are exceptional codes and there are not many construction methods for these codes. For (*n*, *k*) MDS codes, the repetition code, having *k* = 1, can have any length *n* independently of the field size *q*. For *k* = 3 and *k* = *q* − 1, with *q* even, the maximum value of *n* is *n* = *q* + 2 [7]. For all other cases, the maximum value of *n* is *n* = *q* + 1, with a construction known as the doubly extended Reed–Solomon codes. The parity check matrix for a (*q* + 1, *k*) doubly extended Reed–Solomon code is

$$\mathbf{H}\_{\text{RS+}} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & \dots & 1 & 1 & 0 \\ 1 & \alpha\_1 & \alpha\_2 & \alpha\_3 & \alpha\_4 & \alpha\_5 & \alpha\_6 & \dots \alpha\_{q-2} & 0 & 0 \\ 1 & \alpha\_1^2 & \alpha\_2^2 & \alpha\_3^2 & \alpha\_4^2 & \alpha\_5^2 & \alpha\_6^2 & \dots \alpha\_{q-2}^2 & 0 & 0 \\ 1 & \alpha\_1^3 & \alpha\_2^3 & \alpha\_3^3 & \alpha\_4^3 & \alpha\_5^3 & \alpha\_6^3 & \dots \alpha\_{q-2}^3 & 0 & 0 \\ 1 & \alpha\_1^4 & \alpha\_2^4 & \alpha\_3^4 & \alpha\_4^4 & \alpha\_5^4 & \alpha\_6^4 & \dots \alpha\_{q-2}^4 & 0 & 0 \\ & \dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots \\ 1 & \alpha\_1^{q-k} & \alpha\_2^{q-k} & \alpha\_3^{q-k} & \alpha\_4^{q-k} & \alpha\_5^{q-k} & \alpha\_6^{q-k} & \dots \alpha\_{q-2}^{q-k} & 0 & 1 \end{bmatrix} \tag{6.59}$$

where the *q* elements of *GF*(*q*) are {0, 1, α<sub>1</sub>, α<sub>2</sub>, α<sub>3</sub>, ..., α<sub>*q*−2</sub>}.

As the codes are MDS, the minimum Hamming distance is *q* + 2 − *k*, forming a family of (*q* + 1, *k*, *q* + 2 − *k*) codes meeting the Singleton bound [8].

The MDS codes may be used as binary codes simply by restricting the data symbols to the values {0, 1} to produce a subfield subcode. Alternatively, for *GF*(2*<sup>m</sup>*), each symbol may be replaced with an *m* × *m* binary matrix to produce the family of ((2*<sup>m</sup>* + 1)*m*, *mk*, 2*<sup>m</sup>* + 2 − *k*) binary codes. As an example, with *m* = 4 and *k* = 12, the result is a (68, 48, 6) binary code. This is not a very competitive code because the equivalent best known code [5], the (68, 48, 8) code, has a much better minimum Hamming distance.

However, using the Forney concatenation [2] on each symbol almost doubles the minimum Hamming distance with little increase in redundancy and produces the family of ((2*<sup>m</sup>* + 1)(*m* + 1), *mk*, 2(2*<sup>m</sup>* + 1 − *k*) + 1) binary codes. With the same example values for *m* and *k*, the (85, 48, 11) binary code is produced. Kasahara [6] noticed that it is sometimes possible with this code construction to add an additional information bit by adding the all 1's codeword to the generator matrix of the code. Equivalently expressed, all of the codewords may be complemented without degrading the minimum Hamming distance. It is possible to go further, depending on the length of the code and the minimum Hamming distance. Since the binary parity of each symbol is always even, if *m* + 1 is an odd number then adding the all 1's pattern to each symbol will produce a weight of at least 1 per symbol. For the (85, 48, 11) constructed binary code, *m* + 1 = 5, an odd number, and the number of symbols is 17. Hence, adding the all 1's pattern (i.e. 85 1's) to each codeword will produce a minimum weight of at least 17. Accordingly, a (85, 49, 11) code is produced. Adding an overall parity bit to each codeword increases the minimum Hamming distance to 12, producing a (86, 49, 12) code, and shortening the code by deleting one information bit produces a (85, 48, 12) code. This is a good code because the corresponding best known code is also a (85, 48, 12) code. However, the construction method is different because the best known code is derived from the (89, 56, 11) cyclic code.
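The parameter bookkeeping for this family is easy to mechanise; a tiny sketch (the function name is ours):

```python
# Parameters of the Forney-concatenated construction,
# ((2^m + 1)(m + 1), m*k, 2(2^m + 1 - k) + 1), from the text.
def concatenated_params(m, k):
    n_symbols = 2 ** m + 1          # length of the doubly extended MDS code
    return (n_symbols * (m + 1), m * k, 2 * (n_symbols - k) + 1)

print(concatenated_params(4, 12))   # prints (85, 48, 11)
```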

Looking at constructing binary codes from MDS codes by simply restricting the data symbols to the values {0, 1}, consider the example of the extended Reed–Solomon code of length 16 using *GF*(2<sup>4</sup>) with 2 parity symbols. The code is the MDS (16, 14, 3) code. The parity check matrix is

$$\mathbf{H}_{(16,14)} = \begin{bmatrix} 1 & \alpha^1 & \alpha^2 & \alpha^3 & \alpha^4 & \alpha^5 & \alpha^6 & \alpha^7 & \alpha^8 & \alpha^9 & \alpha^{10} & \alpha^{11} & \alpha^{12} & \alpha^{13} & \alpha^{14} & 0 \\ 1 & \alpha^3 & \alpha^6 & \alpha^9 & \alpha^{12} & 1 & \alpha^3 & \alpha^6 & \alpha^9 & \alpha^{12} & 1 & \alpha^3 & \alpha^6 & \alpha^9 & \alpha^{12} & 1 \end{bmatrix} \tag{6.60}$$

With binary codeword coordinates denoted as *c<sub>i</sub>*, the first parity check equation, from the first row of the parity check matrix, is

$$\sum\_{i=0}^{14} c\_i \alpha^i = 0 \tag{6.61}$$

Squaring both sides of this equation produces

$$\sum\_{i=0}^{14} c\_i^2 \alpha^{2i} = 0\tag{6.62}$$

As the codeword coordinates are binary, *c<sub>i</sub>*<sup>2</sup> = *c<sub>i</sub>*, and so any codeword satisfying the equations of (6.58) satisfies all of the following equations by induction from (6.60)

$$\mathbf{H}_{(16,14)} = \begin{bmatrix}
1 & \alpha^1 & \alpha^2 & \alpha^3 & \alpha^4 & \alpha^5 & \alpha^6 & \alpha^7 & \alpha^8 & \alpha^9 & \alpha^{10} & \alpha^{11} & \alpha^{12} & \alpha^{13} & \alpha^{14} & 0 \\
1 & \alpha^2 & \alpha^4 & \alpha^6 & \alpha^8 & \alpha^{10} & \alpha^{12} & \alpha^{14} & \alpha^1 & \alpha^3 & \alpha^5 & \alpha^7 & \alpha^9 & \alpha^{11} & \alpha^{13} & 0 \\
1 & \alpha^3 & \alpha^6 & \alpha^9 & \alpha^{12} & 1 & \alpha^3 & \alpha^6 & \alpha^9 & \alpha^{12} & 1 & \alpha^3 & \alpha^6 & \alpha^9 & \alpha^{12} & 1 \\
1 & \alpha^4 & \alpha^8 & \alpha^{12} & \alpha^1 & \alpha^5 & \alpha^9 & \alpha^{13} & \alpha^2 & \alpha^6 & \alpha^{10} & \alpha^{14} & \alpha^3 & \alpha^7 & \alpha^{11} & 0 \\
1 & \alpha^6 & \alpha^{12} & \alpha^3 & \alpha^9 & 1 & \alpha^6 & \alpha^{12} & \alpha^3 & \alpha^9 & 1 & \alpha^6 & \alpha^{12} & \alpha^3 & \alpha^9 & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots
\end{bmatrix} \tag{6.63}$$

There are 4 consecutive zeros of the parent Reed–Solomon code from the first 4 rows of the parity check matrix, indicating that the minimum Hamming distance of the binary code may be 5. However, comparing the last column of this matrix with (6.57) indicates that this column is not correct.

**Table 6.3** *GF*(16) extension field defined by 1 + α<sup>1</sup> + α<sup>4</sup> = 0

| | | |
| --- | --- | --- |
| α<sup>0</sup> = 1 | α<sup>5</sup> = α + α<sup>2</sup> | α<sup>10</sup> = 1 + α + α<sup>2</sup> |
| α<sup>1</sup> = α | α<sup>6</sup> = α<sup>2</sup> + α<sup>3</sup> | α<sup>11</sup> = α + α<sup>2</sup> + α<sup>3</sup> |
| α<sup>2</sup> = α<sup>2</sup> | α<sup>7</sup> = 1 + α + α<sup>3</sup> | α<sup>12</sup> = 1 + α + α<sup>2</sup> + α<sup>3</sup> |
| α<sup>3</sup> = α<sup>3</sup> | α<sup>8</sup> = 1 + α<sup>2</sup> | α<sup>13</sup> = 1 + α<sup>2</sup> + α<sup>3</sup> |
| α<sup>4</sup> = 1 + α | α<sup>9</sup> = α + α<sup>3</sup> | α<sup>14</sup> = 1 + α<sup>3</sup> |
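Table 6.3 can be regenerated with a few lines of Python. The sketch below assumes the usual convention that bit *i* of an integer holds the coefficient of α<sup>*i*</sup>:

```python
# A short sketch that regenerates Table 6.3: the nonzero elements of
# GF(16) built from the primitive polynomial 1 + x + x^4, so alpha^4 = 1 + alpha.
# Bit i of the integer representation holds the coefficient of alpha^i.

def gf16_exp_table():
    table, v = [], 1
    for _ in range(15):
        table.append(v)
        v <<= 1                       # multiply by alpha
        if v & 0b10000:               # reduce modulo x^4 + x + 1
            v ^= 0b10011
    return table

exp = gf16_exp_table()
# alpha^12 = 1 + alpha + alpha^2 + alpha^3, matching the table entry
print(format(exp[12], '04b'))         # prints 1111
```

Since α is primitive, the 15 entries are distinct and multiplying the last entry by α returns 1.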

Constructing the binary check matrix from the parity check equations (6.58), using Table 6.3 to substitute the respective 4-bit column vector for each nonzero *GF*(16) symbol (0 in *GF*(16) is 0000), produces the following binary check matrix

$$\mathbf{H}_{(16,8)} = \begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 0 \\[4pt]
1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 1 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 1 & 0
\end{bmatrix} \tag{6.64}$$

Weight spectrum analysis indicates that the minimum Hamming distance of this code is 4, due to a single codeword of weight 4 with support {0, 5, 10, 15}. Deleting the last column of the parity check matrix produces a (15, 8, 5) code. Another approach is needed to go from the MDS code to a binary code without incurring a loss in the minimum Hamming distance.
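The weight-4 codeword is easy to confirm by brute force. The following Python sketch (our own construction of the two symbol rows α<sup>*i*</sup> and α<sup>3*i*</sup>, assuming the Table 6.3 bit convention) expands the parity checks to binary and enumerates the null space:

```python
# A sketch verifying the claim above: expanding the (16, 14, 3) MDS parity
# checks over GF(16) into binary gives a (16, 8) code whose minimum
# Hamming distance is only 4 (e.g. the codeword with support {0, 5, 10, 15}).

def gf16_exp():
    t, v = [], 1
    for _ in range(15):
        t.append(v)
        v = (v << 1) ^ (0b10011 if v & 0b1000 else 0)
    return t

exp = gf16_exp()
# Symbol rows of H: alpha^i and alpha^(3i) for i = 0..14, plus the
# extension column (0 for the first row, 1 for the second).
sym_rows = [[exp[i] for i in range(15)] + [0],
            [exp[(3 * i) % 15] for i in range(15)] + [1]]

# Expand each GF(16) row into 4 binary rows (one per basis coefficient).
H = [sum(((sym >> bit) & 1) << col for col, sym in enumerate(row))
     for row in sym_rows for bit in range(4)]

parity = lambda x: bin(x).count('1') & 1
codewords = [c for c in range(1 << 16) if all(parity(row & c) == 0 for row in H)]
weights = sorted(bin(c).count('1') for c in codewords if c)
print(len(codewords), weights[0])   # 256 codewords, so dimension 8; d_min = 4
```

The codeword with support {0, 5, 10, 15} corresponds to the bitmask `1 | 1 << 5 | 1 << 10 | 1 << 15` and appears in the enumerated null space.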

It is necessary to use the generalised Reed–Solomon MDS code. Here, each column of the parity check matrix is multiplied by a nonzero element of the *GF*(2*<sup>m</sup>*) field, the multipliers being denoted as {ν<sub>0</sub>, ν<sub>1</sub>, ν<sub>2</sub>, ν<sub>3</sub>, ..., ν<sub>2<sup>m</sup></sub>}. It is not necessary for these to be distinct, only that each has a multiplicative inverse. The parity check matrix for the (*q* + 1, *k*) generalised Reed–Solomon MDS code is

$$\mathbf{H}_{GRS+} = \begin{bmatrix}
\nu_0 & \nu_1 & \nu_2 & \nu_3 & \nu_4 & \nu_5 & \dots & \nu_{q-2} & \nu_{q-1} & 0 \\
\nu_0 & \nu_1\alpha_1 & \nu_2\alpha_2 & \nu_3\alpha_3 & \nu_4\alpha_4 & \nu_5\alpha_5 & \dots & \nu_{q-2}\alpha_{q-2} & 0 & 0 \\
\nu_0 & \nu_1\alpha_1^2 & \nu_2\alpha_2^2 & \nu_3\alpha_3^2 & \nu_4\alpha_4^2 & \nu_5\alpha_5^2 & \dots & \nu_{q-2}\alpha_{q-2}^2 & 0 & 0 \\
\nu_0 & \nu_1\alpha_1^3 & \nu_2\alpha_2^3 & \nu_3\alpha_3^3 & \nu_4\alpha_4^3 & \nu_5\alpha_5^3 & \dots & \nu_{q-2}\alpha_{q-2}^3 & 0 & 0 \\
\nu_0 & \nu_1\alpha_1^4 & \nu_2\alpha_2^4 & \nu_3\alpha_3^4 & \nu_4\alpha_4^4 & \nu_5\alpha_5^4 & \dots & \nu_{q-2}\alpha_{q-2}^4 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
\nu_0 & \nu_1\alpha_1^{q-k} & \nu_2\alpha_2^{q-k} & \nu_3\alpha_3^{q-k} & \nu_4\alpha_4^{q-k} & \nu_5\alpha_5^{q-k} & \dots & \nu_{q-2}\alpha_{q-2}^{q-k} & 0 & \nu_q
\end{bmatrix}$$

It is clear that, as a nonbinary code with codeword coefficients from *GF*(2*<sup>m</sup>*), the distance properties remain unchanged because the generalised Reed–Solomon code is still an MDS code. Depending on the coordinate position, each nonzero element value has a unique mapping to another nonzero element value. It is as subfield subcodes that the generalised Reed–Solomon codes have an advantage. It should be noted that Goppa codes are examples of generalised Reed–Solomon codes.

Returning to the relatively poor (16, 8, 4) binary code derived from the (16, 14, 3) MDS code, consider the generalised (16, 14, 3) Reed–Solomon code with parity check matrix

$$\mathbf{H}\_{\text{(16,14)}} = \begin{bmatrix} \nu\_0 & \nu\_1 & \nu\_2 & \nu\_3 & \nu\_4 & \nu\_5 & \nu\_6 & \dots & \nu\_{13} & \nu\_{14} & \nu\_{15} \\ \nu\_0 & \nu\_1 \alpha^1 & \nu\_2 \alpha^2 & \nu\_3 \alpha^3 & \nu\_4 \alpha^4 & \nu\_5 \alpha^5 & \nu\_6 \alpha^6 & \dots & \nu\_{13} \alpha^{13} & \nu\_{14} \alpha^{14} & 0 \end{bmatrix} \quad (6.65)$$

Setting the vector ν to

$$\{\alpha^{12}, \alpha^{4}, \alpha^{3}, \alpha^{9}, \alpha^{4}, \alpha^{1}, \alpha^{8}, \alpha^{6}, \alpha^{3}, \alpha^{6}, \alpha^{1}, \alpha^{2}, \alpha^{2}, \alpha^{8}, \alpha^{9}, \alpha^{12}\}$$

Constructing the binary check matrix from these parity check equations, using Table 6.3 to substitute the respective 4-bit column vector for each nonzero *GF*(16) symbol (0 in *GF*(16) is 0000), produces the following binary check matrix

$$\mathbf{H}_{(16,8,5)} = \begin{bmatrix}
1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 0 & 1 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 \\[4pt]
1 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 \\
1 & 0 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 1 & 0 & 0
\end{bmatrix} \tag{6.66}$$

Weight spectrum analysis indicates that the minimum Hamming distance of this code is 5 and achieves the aim of deriving a binary code from an MDS code without loss of minimum Hamming distance. Moreover, the additional symbol of 1, the last column in (6.59), may be appended to produce the following check matrix for the (17, 9, 5) binary code:

$$\mathbf{H}_{(17,9,5)} = \begin{bmatrix}
1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 0 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\[4pt]
1 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0
\end{bmatrix} \tag{6.67}$$
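The (16, 8, 5) claim may be confirmed in the same brute-force manner. This Python sketch (again assuming the Table 6.3 bit convention; the variable names are ours) scales the two parity-check rows by ν and enumerates the null space of the binary expansion:

```python
# A sketch checking that the generalised-RS multipliers nu really lift the
# (16, 14, 3) MDS code to a binary (16, 8, 5) code, as claimed above.

def gf16_tables():
    exp, v = [], 1
    for _ in range(15):
        exp.append(v)
        v = (v << 1) ^ (0b10011 if v & 0b1000 else 0)
    log = {e: i for i, e in enumerate(exp)}
    return exp, log

exp, log = gf16_tables()
mul = lambda a, b: 0 if 0 in (a, b) else exp[(log[a] + log[b]) % 15]

# The multipliers nu from the text, as powers of alpha.
nu_pow = [12, 4, 3, 9, 4, 1, 8, 6, 3, 6, 1, 2, 2, 8, 9, 12]
nu = [exp[p] for p in nu_pow]

sym_rows = [nu[:],                                            # row of nu_i
            [mul(nu[i], exp[i]) for i in range(15)] + [0]]    # nu_i * alpha^i

H = [sum(((sym >> bit) & 1) << col for col, sym in enumerate(row))
     for row in sym_rows for bit in range(4)]

parity = lambda x: bin(x).count('1') & 1
codewords = [c for c in range(1 << 16) if all(parity(row & c) == 0 for row in H)]
print(len(codewords), min(bin(c).count('1') for c in codewords if c))  # 256 5
```

The null space has 256 elements (dimension 8) and minimum nonzero weight 5, in agreement with the weight spectrum analysis quoted above.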

Not surprisingly, this code has the same parameters as the best known code [5]. The reader will be asking: how is the vector ν chosen?

Using trial and error methods it is extremely difficult, and somewhat tiresome, to find a suitable vector ν, even for such a short code. In addition, weight spectrum analysis has to be carried out for each trial code.

The answer is that the vector ν is constructed from an irreducible Goppa polynomial of degree 2, *g*(*z*) = α<sup>3</sup> + *z* + *z*<sup>2</sup>. Referring to Table 6.3, the reader may verify, using all elements of *GF*(16), that ν is given by *g*(α<sup>*i*</sup>)<sup>−1</sup> for *i* = 0 to 15.
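This construction is easily checked in a few lines of Python. The sketch below assumes the Table 6.3 representation and the element ordering 1, α, ..., α<sup>14</sup>, 0, which reproduces the listed vector exactly:

```python
# A short check of the construction just described: with the GF(16) field
# of Table 6.3 and g(z) = alpha^3 + z + z^2, the multiplier for field
# element a is g(a)^(-1); evaluated over 1, alpha, ..., alpha^14, 0 this
# reproduces the vector nu used above.

def gf16_tables():
    exp, v = [], 1
    for _ in range(15):
        exp.append(v)
        v = (v << 1) ^ (0b10011 if v & 0b1000 else 0)
    log = {e: i for i, e in enumerate(exp)}
    return exp, log

exp, log = gf16_tables()
mul = lambda a, b: 0 if 0 in (a, b) else exp[(log[a] + log[b]) % 15]
inv = lambda a: exp[(-log[a]) % 15]
g = lambda z: exp[3] ^ z ^ mul(z, z)          # g(z) = alpha^3 + z + z^2

points = [exp[i] for i in range(15)] + [0]    # all 16 elements of GF(16)
nu = [log[inv(g(a))] for a in points]         # exponents of g(a)^(-1)
print(nu)   # [12, 4, 3, 9, 4, 1, 8, 6, 3, 6, 1, 2, 2, 8, 9, 12]
```

Since *g*(*z*) is irreducible over *GF*(16) it has no root in the field, so every *g*(*a*) is nonzero and invertible.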

Unfortunately, the technique is only valid for binary codes with a minimum Hamming distance of 5, and *m* has to be even. Weight spectrum analysis has confirmed that the (65, 53, 5), (257, 241, 5), (1025, 1005, 5) and (4097, 4073, 5) codes can be constructed in this way from doubly extended, generalised Reed–Solomon MDS codes.

#### **6.9 Summary**

It has been shown that interpolation plays an important, mostly hidden role in algebraic coding theory. The Reed–Solomon codes, BCH codes, and Goppa codes are all codes that may be constructed via interpolation. It has also been demonstrated that all of these codes form part of a large family of generalised MDS codes. The encoding of BCH and Goppa codes has been explored from the viewpoint of classical Lagrange interpolation. It was shown in detail how Goppa codes are designed and constructed starting from first principles. The parity check matrix of a BCH code was derived as a Goppa code proving that BCH codes are a subset of Goppa codes. Following from this result and using properties of the cyclotomic cosets it was explained how the minimum Hamming distance of some BCH codes is able to exceed the BCH bound producing outstanding codes. It was shown how these exceptional BCH codes can be identified and constructed. A little known paper by Goppa was discussed and as a result it was shown how Goppa codes and BCH codes may be extended in length with additional parity check bits resulting in increased minimum Hamming distance of the code. Several examples were given of the technique which results in some outstanding codes. Reed–Solomon codes were explored as a means of constructing binary codes resulting in improvements to the database of best known codes.

# **References**



# **Chapter 7 Reed–Solomon Codes and Binary Transmission**

#### **7.1 Introduction**

Reed–Solomon codes, named after Reed and Solomon [9] following their publication in 1960, have been used together with hard decision decoding in a wide range of applications. Reed–Solomon codes are maximum distance separable (MDS) codes and have the highest possible minimum Hamming distance. The codes have symbols from F<sub>*q*</sub> with parameters (*q* − 1, *k*, *q* − *k*). They are not binary codes but frequently are used with *q* = 2*<sup>m</sup>*, so that there is a mapping to residue classes of a primitive polynomial with binary coefficients [6] and each element of F<sub>2<sup>m</sup></sub> is represented as a binary *m*-tuple. Thus, binary codes with code parameters (*m*[2*<sup>m</sup>* − 1], *km*, 2*<sup>m</sup>* − *k*) can be constructed from Reed–Solomon codes. Reed–Solomon codes can be extended in length by up to two symbols, and in special cases by up to three symbols. In terms of applications, they are probably the most popular family of codes.

Researchers over the years have tried to come up with an efficient soft decision decoding algorithm, and a breakthrough in hard decision decoding in 1997 by Madhu Sudan [10] enabled more than (2*<sup>m</sup>* − *k*)/2 errors to be corrected with polynomial time complexity. The algorithm was limited to low rate Reed–Solomon codes. An improved algorithm for all code rates was discovered by Guruswami and Sudan [3], and the Guruswami–Sudan algorithm was subsequently applied in a soft decision decoder by Kötter and Vardy [5]. A very readable, tutorial style explanation of the Guruswami–Sudan algorithm is presented by McEliece [7]. Many papers followed, discussing soft decision decoding of Reed–Solomon codes [1], mostly featuring simulation results of short codes such as the (15, 11, 5) and the (31, 25, 7) codes. Binary transmission using baseband bipolar signalling or binary phase shift keying (BPSK) [8] and the additive white Gaussian noise (AWGN) channel is most common. Some authors have used quadrature amplitude modulation (QAM) [8] with 2*<sup>m</sup>* levels to map each F<sub>2<sup>m</sup></sub> symbol [5]. In either case, there is a poor match between


the modulation method and the error-correcting code. The performance achieved is not competitive compared to other error-correcting code arrangements. For binary transmission, a binary error-correcting code should be used and not a symbol-based error-correcting code. For QAM and other multilevel signalling, better performance is obtained by applying low-rate codes to the least significant bits of received symbols and high-rate codes to the most significant bits of received symbols. Applying a fixed-rate error-correcting code to all symbol bits is the reason for the inefficiency in using Reed–Solomon codes on binary channels.

Still, these modulation methods do provide a means of comparing different decoder arrangements for RS codes. This theme is explored later in Sect. 7.3 where soft decision decoding of RS codes is explored.

# **7.2 Reed–Solomon Codes Used with Binary Transmission-Hard Decisions**

Whilst RS codes are very efficient codes, being MDS codes, they are not particularly well suited to the binary channel, as will become apparent from the results presented below. Defining the RS code over F<sub>2<sup>m</sup></sub>, RS codes extended with a single symbol are considered, with length *n* = 2*<sup>m</sup>*, with *k* information symbols and with *d<sub>min</sub>* = *n* − *k* + 1. The length in bits is *n<sub>b</sub>* = *mn* and there are *k<sub>b</sub>* = *km* information bits.

The probability of a symbol error with binary transmission and the AWGN channel is

$$p\_s = 1 - \left(1 - \frac{1}{2}erfc\left(\sqrt{\frac{k}{n}\frac{E\_b}{N\_0}}\right)\right)^m \tag{7.1}$$

The RS code can correct *t* errors where *t* = ⌊(*n* − *k*)/2⌋. Accordingly, a decoder error occurs if there are more than *t* symbol errors, and the probability of decoder error, *p<sub>C</sub>*, is given by

$$p\_C = \sum\_{i=t+1}^{n} \frac{n!}{(n-i)!i!} p\_s^i (1-p\_s)^{n-i} \tag{7.2}$$

As a practical example, we will consider the (256, 234, 23) extended RS code. Representing each F<sub>2<sup>8</sup></sub> symbol as a binary 8-tuple, the RS code becomes a (2048, 1872, 23) binary code. The performance with hard decisions is shown in Fig. 7.1 as a function of *E<sub>b</sub>*/*N*<sub>0</sub>. This code may be directly compared to the binary (2048, 1872, 33) Goppa code since their lengths and code rates are identical. The decoder error probability for the binary Goppa code is given by

**Fig. 7.1** Comparison of hard decision decoding of the (256, 234, 23) RS code compared to the (2048, 1872, 33) Goppa code (same code length in bits and code rate)

$$p_C = \sum_{i=t_G+1}^{nm} \frac{(nm)!}{(nm-i)!\,i!} \left(\frac{1}{2}\operatorname{erfc}\sqrt{\frac{k}{n}\frac{E_b}{N_0}}\right)^i \left(1-\frac{1}{2}\operatorname{erfc}\sqrt{\frac{k}{n}\frac{E_b}{N_0}}\right)^{nm-i} \tag{7.3}$$

where *t<sub>G</sub>* = ⌊(*d<sub>min</sub>* − 1)/2⌋ for the binary Goppa code.

The comparison in performance is shown in Fig. 7.1, where it can be seen that the Goppa code is approximately 0.75 dB better than the RS code at a frame error rate of 1 × 10<sup>−10</sup>.
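The curves of Fig. 7.1 can be reproduced pointwise from Eqs. (7.1)–(7.3). The following Python sketch (the 6 dB operating point and the helper names are our choices) computes both decoder error probabilities, using the numerically safer complement of the tail sum:

```python
# A sketch evaluating Eqs. (7.1)-(7.3) for the (256, 234) RS code used as a
# (2048, 1872) binary code versus the (2048, 1872, 33) binary Goppa code.
from math import erfc, sqrt, comb

def p_bit(rate, ebno_db):
    """Bit error probability for BPSK on the AWGN channel."""
    return 0.5 * erfc(sqrt(rate * 10 ** (ebno_db / 10)))

def p_decoder_error(n, t, p):
    """Probability of more than t errors in n symbols, the tail sum of
    Eqs. (7.2)/(7.3), computed as 1 minus the head to avoid overflow."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(t + 1))

rate, ebno = 234 / 256, 6.0
pb = p_bit(rate, ebno)
ps = 1 - (1 - pb) ** 8                    # symbol error probability, Eq. (7.1)
pc_rs = p_decoder_error(256, 11, ps)      # t = 11 for d_min = 23
pc_goppa = p_decoder_error(2048, 16, pb)  # t = 16 for d_min = 33
print(pc_goppa < pc_rs)                   # prints True: the Goppa code is better
```

At this operating point the Goppa code's decoder error probability is well below that of the RS code, consistent with Fig. 7.1.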

It is interesting to speculate whether the performance of the RS code could be improved by using 3-level quantisation of the channel bits and erasing symbols if any of the bits within a symbol are erased. The probabilities of a bit erasure, *p<sub>erase</sub>*, and a bit error, *p<sub>b</sub>*, for 3-level quantisation are given in Chap. 3, Eqs. (3.41) and (3.42) respectively, but note that a lower threshold needs to be used for best performance with these code parameters: √*E<sub>s</sub>* − 0.2 × σ instead of √*E<sub>s</sub>* − 0.65 × σ. The probability of a symbol erasure, *p<sub>S erase</sub>*, is given by

$$p_{S\,erase} = 1 - (1 - p_{erase})^m \tag{7.4}$$

and the probability of a symbol error, *p<sub>S error</sub>*, is given by

$$p_{S\,error} = 1 - \left(1 - (1 - p_{erase})^m\right) - (1 - p_{erase} - p_b)^m \tag{7.5}$$

**Fig. 7.2** Comparison of hard decision and erasure decoding of the (256, 250, 7) RS code for the binary channel

and

$$p_{S\,error} = (1 - p_{erase})^m - (1 - p_{erase} - p_b)^m \tag{7.6}$$

For each received vector, provided that the number of errors *t* and the number of erasures *s* satisfy 2*t* + *s* ≤ *n* − *k*, the received vector will be decoded correctly. A decoder error occurs if 2*t* + *s* > *n* − *k*.

The probability distribution of errors and erasures in the received vector may easily be found by defining a polynomial for a single symbol and raising it to the power of *n*, the number of symbols in a codeword:

$$e(z) = \left(1 - p_{S\,error} - p_{S\,erase} + p_{S\,erase}\,z^{-1} + p_{S\,error}\,z^{-2}\right)^n \tag{7.7}$$

The probability of decoder error *p<sub>C</sub>* is simply found from *e*(*z*) by summing all coefficients of *z*<sup>−*i*</sup> for which *i* is greater than *n* − *k*. This is very straightforward with a symbolic mathematics program such as Mathematica. The results for the RS (256, 234, 23) code are shown in Fig. 7.1. It can be seen that there is an improvement over the hard decision case, but it is rather marginal.
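The same computation can be done without a symbolic package by convolving polynomial coefficient arrays. This Python sketch (our own implementation, with illustrative channel probabilities rather than values from Eqs. (3.41)–(3.42)) evaluates Eq. (7.7) for the (256, 250, 7) code and cross-checks it against a direct multinomial sum:

```python
# A sketch of the generating-polynomial method of Eq. (7.7): an erasure
# costs z^-1, an error z^-2, and decoding fails when the total exponent
# exceeds n - k. Cross-checked against a direct multinomial sum.
from math import comb

def pc_poly(n, k, p_err, p_erase):
    coeffs = [1.0]                      # e(z), index = s + 2t
    base = [1.0 - p_err - p_erase, p_erase, p_err]
    for _ in range(n):                  # raise the symbol polynomial to n
        new = [0.0] * (len(coeffs) + 2)
        for i, c in enumerate(coeffs):
            for j, b in enumerate(base):
                new[i + j] += c * b
        coeffs = new
    return sum(coeffs[n - k + 1:])      # decoder error: 2t + s > n - k

def pc_direct(n, k, p_err, p_erase):
    p_ok = 1.0 - p_err - p_erase
    return sum(comb(n, t) * comb(n - t, s)
               * p_err**t * p_erase**s * p_ok**(n - t - s)
               for t in range(n + 1) for s in range(n - t + 1)
               if 2 * t + s > n - k)

print(pc_poly(256, 250, 1e-3, 5e-3))
```

Both routines evaluate the same distribution, so their results agree to floating point accuracy.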

A rather more convincing case is shown in Fig. 7.2 for the RS (256, 250, 7) code, where the performance is shown down to frame error rates of 1 × 10<sup>−20</sup>. In this case, there is an improvement of approximately 0.4 dB.

It has already been established that for the binary transmission channel, RS codes based on *GF*(2*<sup>m</sup>*) do not perform as well as a binary designed code with the same code parameters. The problem is that bit errors occur independently, and it only takes a single bit error to cause a symbol error. Thus, the code structure, being symbol based, is not well matched to the transmission channel. Another way of looking at this is to consider the Hamming distance. For the binary (2048, 1872) codes considered previously, the RS-based code turns out to have a binary Hamming distance of 23 whilst the binary Goppa code has a Hamming distance of 33. However, there is a simple method of modifying RS codes to produce good binary codes, as discussed in Chap. 6. It is a code concatenation method best suited to producing symbol-based binary codes, whereby a single overall binary parity check is added to each binary *m*-tuple representing each symbol. Starting with a RS (*n*, *k*, *n* − *k* + 1) code, adding the overall binary parity checks produces a (*n*[*m* + 1], *km*, 2[*n* − *k* + 1]) binary code. Now the minimum weight of each symbol is 2, producing a binary code with twice the minimum Hamming distance of the original RS code. Kasahara [4] realised that in some cases an additional information bit may be added by adding the all 1's codeword to the generator matrix. Some best known codes are constructed in this way, as discussed in Chap. 6. One example is the (161, 81, 23) binary code [6].

# **7.3 Reed–Solomon Codes and Binary Transmission Using Soft Decisions**

RS codes applied to the binary transmission channel will now be considered using unquantised soft decision decoding. The best decoder to use is the modified Dorsch decoder, discussed in Chap. 15, because it provides near maximum likelihood decoding. However, when used with codes having a significant coding gain, the code length needs to be typically less than 200 bits.

We will consider augmented, extended RS codes constructed from *GF*(2*<sup>m</sup>*). The length is 2*<sup>m</sup>* + 1 and these are maximum distance separable (MDS) codes with parameters (2*<sup>m</sup>* + 1, *k*, 2*<sup>m</sup>* + 2 − *k*). Moreover, in the general case augmented, extended RS codes may be constructed using any Galois field *GF*(*q*), with parameters (*q* + 1, *k*, *q* + 2 − *k*) [6]. Denoting the *q* field elements as 0, α<sub>0</sub>, α<sub>1</sub>, α<sub>2</sub>, ..., α<sub>*q*−2</sub>, the parity-check matrix is given by

$$\mathbf{H} = \begin{bmatrix} \alpha\_0^j & \alpha\_1^j & \alpha\_2^j & \dots & \alpha\_{q-2}^j & 1 & 0\\ \alpha\_0^{j+1} & \alpha\_1^{j+1} & \alpha\_2^{j+1} & \dots & \alpha\_{q-2}^{j+1} & 0 & 0\\ \alpha\_0^{j+2} & \alpha\_1^{j+2} & \alpha\_2^{j+2} & \dots & \alpha\_{q-2}^{j+2} & 0 & 0\\ \dots & \dots & \dots & \dots & \dots & \dots & \dots\\ \alpha\_0^{j+q-k-1} & \alpha\_1^{j+q-k-1} & \alpha\_2^{j+q-k-1} & \dots & \alpha\_{q-2}^{j+q-k-1} & 0 & 0\\ \alpha\_0^{j+q-k} & \alpha\_1^{j+q-k} & \alpha\_2^{j+q-k} & \dots & \alpha\_{q-2}^{j+q-k} & 0 & 1 \end{bmatrix}^T$$



There are *q* − *k* + 1 rows of the matrix corresponding to the *q* − *k* + 1 parity symbols of the code. Any *q* − *k* + 1 columns form a Vandermonde matrix, which is non-singular, so any set of *q* − *k* + 1 symbols of a codeword may be erased and solved using the parity-check equations. Thus, the code is MDS. The columns of the parity-check matrix may be permuted into any order, and any set of *s* symbols of a codeword may be defined as parity symbols and permanently erased. Thus, their respective columns of **H** may be deleted to form a shortened (2*<sup>m</sup>* + 1 − *s*, *k*, 2*<sup>m</sup>* + 2 − *s* − *k*) MDS code. This is an important property of MDS codes, particularly for their practical realisation in the form of augmented, extended RS codes, because it enables efficient implementation in applications such as incremental redundancy systems, discussed in Chap. 17, and network coding. Using the first *q* − 1 columns of **H**, and setting α<sub>0</sub>, α<sub>1</sub>, α<sub>2</sub>, ..., α<sub>*q*−2</sub> equal to α<sup>0</sup>, α<sup>1</sup>, α<sup>2</sup>, ..., α<sup>*q*−2</sup>, where α is a primitive element of *GF*(*q*), a cyclic code may be constructed, which has advantages for encoding and decoding implementation.
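The erasure-filling property is easy to demonstrate numerically. The following Python sketch (our own helper functions, using the *GF*(32) field defined below and the non-extended Vandermonde columns only) checks that a random set of *n* − *k* = 15 columns of the (30, 15) parity-check matrix has full rank:

```python
# A sketch of the MDS erasure property discussed above: for the RS code
# over GF(32) (primitive polynomial 1 + x^2 + x^5), any n - k columns of
# the parity-check matrix are linearly independent, so any n - k erased
# symbols can be recovered. Checked here for the (30, 15) shortened code.
import random

def gf32_tables():
    exp, v = [], 1
    for _ in range(31):
        exp.append(v)
        v = (v << 1) ^ (0b100101 if v & 0b10000 else 0)
    log = {e: i for i, e in enumerate(exp)}
    return exp, log

exp, log = gf32_tables()
mul = lambda a, b: 0 if 0 in (a, b) else exp[(log[a] + log[b]) % 31]
inv = lambda a: exp[(-log[a]) % 31]

# 15 x 30 parity-check matrix, rows alpha^(j*i), j = 0..14, i = 0..29.
H = [[exp[(j * i) % 31] for i in range(30)] for j in range(15)]

def rank_gf32(rows):
    rows, r = [row[:] for row in rows], 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        s = inv(rows[r][col])
        rows[r] = [mul(s, x) for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                f = rows[i][col]
                rows[i] = [x ^ mul(f, y) for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

cols = random.sample(range(30), 15)             # any 15 erased positions
sub = [[H[j][i] for i in cols] for j in range(15)]
print(rank_gf32(sub))                           # 15: the erasures are solvable
```

Because the selected columns are powers of distinct field elements, the submatrix is Vandermonde and the result is 15 for every choice of erased positions.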

We will consider the shortened RS code (30, 15, 16) constructed from the *GF*(2<sup>5</sup>) extension field, with **H** constructed using *j* = 0 and α being a primitive root of 1 + *x*<sup>2</sup> + *x*<sup>5</sup>. The *GF*(32) extension field table is given in Table 7.1, based on the primitive polynomial 1 + *x*<sup>2</sup> + *x*<sup>5</sup> so that 1 + α<sup>2</sup> + α<sup>5</sup> = 0, modulo 1 + *x*<sup>31</sup>.

The first step in the construction of the binary code is to construct the parity-check matrix for the shortened RS code (30, 15, 16) which is

$$\mathbf{H}_{(30,15)} = \begin{bmatrix}
1 & 1 & 1 & \dots & 1 \\
1 & \alpha & \alpha^2 & \dots & \alpha^{29} \\
1 & \alpha^2 & \alpha^4 & \dots & \alpha^{27} \\
1 & \alpha^3 & \alpha^6 & \dots & \alpha^{25} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \alpha^{13} & \alpha^{26} & \dots & \alpha^{5} \\
1 & \alpha^{14} & \alpha^{28} & \dots & \alpha^{3}
\end{bmatrix}$$

Each element of this parity-check matrix is to be replaced with a 5 × 5 matrix in terms of the base field, which in this case is binary. First, the number of rows is expanded to form **H**<sub>(30,75)</sub>, given by matrix (7.8). The next step is to expand the columns in terms of the base field by substituting for powers of α using Table 7.1. For example, if an element of the parity-check matrix **H**<sub>(30,75)</sub> is, say, α<sup>26</sup>, then this is replaced by 1 + α + α<sup>2</sup> + α<sup>4</sup>, which in binary is 11101. Proceeding in this way, the binary matrix **H**<sub>(150,75)</sub> is produced (some entries have been left unexpanded to show the procedure partly completed) as in matrix (7.9).

$$\mathbf{H}_{(30,75)} = \begin{bmatrix}
1 & 1 & 1 & \dots & 1 \\
\alpha & \alpha & \alpha & \dots & \alpha \\
\alpha^2 & \alpha^2 & \alpha^2 & \dots & \alpha^2 \\
\alpha^3 & \alpha^3 & \alpha^3 & \dots & \alpha^3 \\
\alpha^4 & \alpha^4 & \alpha^4 & \dots & \alpha^4 \\
1 & \alpha & \alpha^2 & \dots & \alpha^{29} \\
\alpha & \alpha^2 & \alpha^3 & \dots & \alpha^{30} \\
\alpha^2 & \alpha^3 & \alpha^4 & \dots & 1 \\
\alpha^3 & \alpha^4 & \alpha^5 & \dots & \alpha \\
\alpha^4 & \alpha^5 & \alpha^6 & \dots & \alpha^2 \\
1 & \alpha^2 & \alpha^4 & \dots & \alpha^{27} \\
\alpha & \alpha^3 & \alpha^5 & \dots & \alpha^{28} \\
\alpha^2 & \alpha^4 & \alpha^6 & \dots & \alpha^{29} \\
\alpha^3 & \alpha^5 & \alpha^7 & \dots & \alpha^{30} \\
\alpha^4 & \alpha^6 & \alpha^8 & \dots & 1 \\
1 & \alpha^3 & \alpha^6 & \dots & \alpha^{25} \\
\alpha & \alpha^4 & \alpha^7 & \dots & \alpha^{26} \\
\alpha^2 & \alpha^5 & \alpha^8 & \dots & \alpha^{27} \\
\alpha^3 & \alpha^6 & \alpha^9 & \dots & \alpha^{28} \\
\alpha^4 & \alpha^7 & \alpha^{10} & \dots & \alpha^{29} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \alpha^{14} & \alpha^{28} & \dots & \alpha^3 \\
\alpha & \alpha^{15} & \alpha^{29} & \dots & \alpha^4 \\
\alpha^2 & \alpha^{16} & \alpha^{30} & \dots & \alpha^5 \\
\alpha^3 & \alpha^{17} & 1 & \dots & \alpha^6 \\
\alpha^4 & \alpha^{18} & \alpha & \dots & \alpha^7
\end{bmatrix} \tag{7.8}$$


The resulting binary code is a (150, 75, 16) code with the same *d<sub>min</sub>* as the symbol-based RS (30, 15, 16) code. As observed by MacWilliams [6], changing the basis can increase the *d<sub>min</sub>* of the resulting binary code, and making *j* = 3 in the RS parity-check matrix above produces a (150, 75, 19) binary code.

A (150, 75, 22) binary code with increased *d<sub>min</sub>* can be constructed using the overall binary parity-check concatenation, as discussed above. Starting with the (25, 15, 11) RS code, an overall parity check is added to each symbol, producing a parity-check matrix, **H**<sub>(150,75,22)</sub>, given by matrix (7.10). We have constructed two binary (150, 75) codes from RS codes. It is interesting to compare these codes to the known best code of length 150 and rate 1/2. The known best codes are to be found in a database [2], and the best (150, 75) code has a *d<sub>min</sub>* of 23; it is derived by shortening by one bit (by deleting the *x*<sup>150</sup> coordinate from the **G** matrix) the (151, 76, 23) cyclic code whose generator polynomial is


$$\mathbf{H}_{(150,75,22)} = \begin{bmatrix}
100001 & 010001 & 001001 & \dots & 100100 \\
010001 & 001001 & 000101 & \dots & 010010 \\
001001 & 000101 & 000011 & \dots & 100001 \\
000101 & 000011 & 101000 & \dots & 010001 \\
000011 & 101000 & 010100 & \dots & 001001 \\
100001 & 001001 & 000011 & \dots & 110101 \\
010001 & 000101 & 101000 & \dots & 011011 \\
001001 & 000011 & 010100 & \dots & 100100 \\
000101 & 101000 & 001010 & \dots & 010010 \\
000011 & 010100 & 101101 & \dots & 100001 \\
100001 & 000101 & 010100 & \dots & 100111 \\
010001 & 000011 & 001010 & \dots & 111010 \\
001001 & 101000 & 101101 & \dots & 110101 \\
000101 & 010100 & 010111 & \dots & 011011 \\
000011 & 001010 & 100010 & \dots & 110101 \\
\vdots & \vdots & \vdots & & \vdots \\
100001 & 101110 & 011011 & \dots & 000101 \\
010001 & 111111 & 100100 & \dots & 000011 \\
001001 & 110110 & 010010 & \dots & 101000 \\
000101 & 110011 & \dots & \dots & 010100 \\
000011 & 110000 & 010001 & \dots & 001010
\end{bmatrix} \tag{7.10}$$

$$\begin{aligned} g(x) = {} & 1 + x^3 + x^5 + x^8 + x^{10} + x^{11} + x^{14} + x^{15} + x^{17} + x^{19} + x^{20} + x^{22} \\ & + x^{25} + x^{27} + x^{28} + x^{30} + x^{31} + x^{34} + x^{36} + x^{37} + x^{39} + x^{40} + x^{45} + x^{46} \\ & + x^{48} + x^{50} + x^{52} + x^{59} + x^{60} + x^{63} + x^{67} + x^{70} + x^{73} + x^{74} + x^{75} \end{aligned} \tag{7.11}$$

These three binary codes, the RS-based (150, 75, 19) and (150, 75, 22) codes together with the (150, 75, 23) shortened cyclic code have been simulated using binary transmission for the AWGN channel. The decoder used is a modified Dorsch decoder set to evaluate 2 × 10<sup>7</sup> codewords per received vector. This is a large number of codewords and is sufficient to ensure that quasi-maximum likelihood performance is obtained. In this way, the true performance of each code is revealed rather than any shortcomings of the decoder.

The results are shown in Fig. 7.3. Also shown in Fig. 7.3, for comparison purposes, are the sphere packing bound and the erasure-based binomial bound discussed in Chap. 1. Interestingly, all three codes have very good performance and are very close to the erasure-based binomial bound. Although not close to the sphere packing bound, this bound is for non-binary codes, and there is an asymptotic loss of 0.187 dB for rate 1/2 binary codes in comparison to the sphere packing bound as the code length extends towards ∞.

Comparing the three codes, no code has the best overall performance over the entire range of *E<sub>b</sub>*/*N*<sub>0</sub> and, surprisingly, the *d<sub>min</sub>* of the code is no guide. The reason for this can be seen from the Hamming distances of the codewords decoded in error for

**Fig. 7.3** Comparison of the (150, 75, 19) code derived from the RS(30, 15, 16) code, the concatenated (150, 75, 22) code and the known, best (150, 75, 23) code derived by shortening the (151, 76, 23) cyclic code

the three codes after 100 decoder error events. The results at *E<sub>b</sub>*/*N*<sub>0</sub> = 3 dB are shown in Table 7.2, from which it can be seen that the concatenated (150, 75, 22) code has more error events with Hamming distances in the range 22–32, while the known, best (150, 75, 23) code has more error events for Hamming distances up to 36 than the (150, 75, 19) RS-derived code, which is the best code at *E<sub>b</sub>*/*N*<sub>0</sub> = 3 dB.

The distribution of error events is illustrated by the cumulative distribution of error events plotted in Fig. 7.4 as a function of Hamming distance. The weakness of the (150, 75, 22) code at *E<sub>b</sub>*/*N*<sub>0</sub> = 3 dB is apparent.

At higher values of *E<sub>b</sub>*/*N*<sub>0</sub>, the higher *d<sub>min</sub>* of the known, best (150, 75, 23) code gives it the best performance, as can be seen from Fig. 7.3.

#### **7.4 Summary**

This chapter studied the Reed–Solomon codes further; these are ideal symbol-based codes because they are Maximum Distance Separable (MDS) codes. They are not binary codes but were considered for use as binary codes in this chapter. The performance of Reed–Solomon codes used on a binary channel was explored and compared to that of codes designed for binary transmission. The construction of the parity-check matrices of RS codes for use as binary codes was described


**Table 7.2** Hamming distances and multiplicities of 100 error events for each of the (150, 75) codes at *E<sub>b</sub>*/*N*<sub>0</sub> = 3 dB

in detail for specific code examples. The performance results of three differently constructed (150, 75) codes, simulated for the binary AWGN channel using a near maximum likelihood decoder, were presented. Surprisingly, the best performing code at 10<sup>−4</sup> error rate is not the best known (150, 75, 23) code. Error event analysis showed that this is due to the higher multiplicities of weight 32–36 codeword errors. However, beyond 10<sup>−6</sup> error rates, the best known (150, 75, 23) code was shown to be the best performing code.

**Fig. 7.4** Cumulative distribution of Hamming distance error events for the (150, 75, 19) code derived from the RS(30, 15, 16) code, the RS binary parity-check concatenated (150, 75, 22) code and the known, best (150, 75, 23) code derived by shortening the (151, 76, 23) cyclic code



**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 8 Algebraic Geometry Codes**

## **8.1 Introduction**

In order to meet channel capacity, as Shannon postulated, long error-correction codes with large minimum distances need to be found. A large research effort has been dedicated to finding algebraic codes with good properties and efficient decoding algorithms. Reed–Solomon (RS) codes are a product of this research and have over the years found numerous applications, the most noteworthy being their implementation in satellite systems, compact discs, hard drives and modern digital communications. These codes are defined with non-binary alphabets and have the maximum achievable minimum distance for codes of their lengths. A generalisation of RS codes was introduced by Goppa using a unique construction of codes from algebraic curves. This development led to active research in that area, and the complexity of encoding and decoding these codes has since been reduced greatly from when they were first presented. These codes are algebraic geometry (AG) codes and have much greater lengths than RS codes with the same alphabets. Furthermore, these codes can be improved if curves with desirable properties can be found. AG codes have good properties, and some families of these codes have been shown to be asymptotically superior as they exceed the well-known Gilbert–Varshamov bound [16] when the defining finite field F<sub>q</sub> has size *q* ≥ 49, with *q* always a square.

#### **8.2 Motivation for Studying AG Codes**

Aside from their proven superior asymptotic performance when the field size *q* ≥ 49, AG codes defined in much smaller fields have very good parameters. A closer look at tables of best-known codes in [8, 15] shows that algebraic geometry codes feature as the best-known linear codes for an appreciable range of code lengths for different field sizes *q*. To demonstrate this, a comparison of the parameters of AG codes with shortened BCH codes in fields of small size and characteristic 2 is given. AG codes of length *n* and dimension *k* have minimum distance *d* ≥ *n* − *k* − *g* + 1, where *g* is called the genus. Notice that *n* − *k* + 1 is the distance of a maximum distance separable (MDS) code. The genus *g* is then the Singleton defect *s* of an AG code; the Singleton defect is simply the difference between the distance of a code and that of a hypothetical MDS code of the same length and dimension. Similarly, a BCH code is a code with length *n*, dimension *k* and distance *d* = *n* − *k* − *s* + 1, where the Singleton defect *s* is the number of non-consecutive roots of the BCH code.

Consider Table 8.1, which compares the parameters of AG codes from three curves with genera 3, 7, and 14 with shortened BCH codes with similar code rates. At high rates, BCH codes tend to have better minimum distances or smaller Singleton defects. This is because the roots of the BCH code with high rates are usually cyclically consecutive and thus contribute to the minimum distance. For rates close to half, AG codes are better than BCH codes since the number of non-consecutive roots of the BCH code is increased as a result of conjugacy classes. The AG codes benefit from the fact that their Singleton defect or genus remains fixed for all rates. As a consequence AG codes significantly outperform BCH codes at lower rates. However, the genera of curves with many points in small finite fields are usually large and as the length of the AG codes increases in F8, the BCH codes beat AG codes in performance. Tables 8.2 and 8.3 show the comparison between AG and BCH codes in fields F<sup>16</sup> and F32, respectively. With larger field sizes, curves with many points and small genera can be used, and AG codes do much better than BCH codes. It is worth noting that Tables 8.1, 8.2 and 8.3 show codes in fields with size less than 49.

#### *8.2.1 Bounds Relevant to Algebraic Geometry Codes*

Bounds on the performance of codes that are relevant to AG codes are presented in order to show the performance of these codes. Let *A<sub>q</sub>*(*n*, *d*) denote the maximum number of codewords in any code *C* of length *n* and minimum distance *d* defined over a field of size *q*. Let the information rate be *R* = *k*/*n* and the relative minimum distance be δ = *d*/*n* for 0 ≤ δ ≤ 1; then

$$\alpha\_q(\delta) = \limsup\_{n \to \infty} \frac{1}{n} \log\_q A\_q(n, \delta n)$$

which represents the best asymptotic rate *k*/*n* such that there exists a code over a field of size *q* with *d*/*n* converging to δ [18]. The *q*-ary entropy function is given by

$$H\_q(x) = \begin{cases} 0, & x = 0 \\ x \log\_q(q - 1) - x \log\_q x - (1 - x) \log\_q(1 - x), & 0 < x \le \theta \end{cases}$$

where θ = (*q* − 1)/*q*.


**Table 8.1** Comparison between BCH and AG codes in F<sup>8</sup>

**Table 8.2** Comparison between BCH and AG codes in F<sup>16</sup>



**Table 8.3** Comparison between BCH and AG codes in F<sup>32</sup>

The asymptotic Gilbert–Varshamov lower bound on α*<sup>q</sup>* (δ) is given by,

$$\alpha\_q(\delta) \ge 1 - H\_q(\delta) \quad \text{for } 0 \le \delta \le \theta$$

The Tsfasman–Vladut–Zink (TVZ) bound is a lower bound on α<sub>q</sub>(δ) that holds for certain families of AG codes; it is given by

$$\alpha\_q(\delta) \ge 1 - \delta - \frac{1}{\sqrt{q} - 1} \quad \text{where } q \text{ is a perfect square.}$$

The asymptotic supremacy of AG codes lies in the fact that, for *q* a perfect square with *q* ≥ 49, the TVZ bound exceeds the Gilbert–Varshamov bound.
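A quick numerical check of the two lower bounds can be useful. The sketch below (plain Python; `entropy_q`, `gilbert_varshamov` and `tvz` are illustrative names, not from the text) evaluates both bounds and confirms that for *q* = 64, a perfect square above 49, the TVZ line exceeds the GV curve over a range of relative distances δ:

```python
import math

def entropy_q(x: float, q: int) -> float:
    """q-ary entropy H_q(x), defined for 0 <= x <= (q - 1) / q."""
    if x == 0:
        return 0.0
    return (x * math.log(q - 1, q)
            - x * math.log(x, q)
            - (1 - x) * math.log(1 - x, q))

def gilbert_varshamov(delta: float, q: int) -> float:
    """Asymptotic Gilbert-Varshamov lower bound on alpha_q(delta)."""
    return 1.0 - entropy_q(delta, q)

def tvz(delta: float, q: int) -> float:
    """Tsfasman-Vladut-Zink lower bound; q must be a perfect square."""
    return 1.0 - delta - 1.0 / (math.isqrt(q) - 1)

q = 64
# Sample delta over (0, 1) and record where TVZ beats GV.
exceeds = [d / 100 for d in range(1, 100)
           if tvz(d / 100, q) > gilbert_varshamov(d / 100, q)]
print(f"TVZ exceeds GV at {len(exceeds)} of 99 sampled delta values")
```

Repeating the sampling with *q* = 16 finds no such δ, consistent with the *q* ≥ 49 threshold.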

Figures 8.1, 8.2 and 8.3 show the *R* versus δ plots of these bounds for a range of *q*.

**Fig. 8.1** Tsfasman–Vladut–Zink and Gilbert–Varshamov bound for *q* = 32

**Fig. 8.2** Tsfasman–Vladut–Zink and Gilbert–Varshamov bound for *q* = 64

**Fig. 8.3** Tsfasman–Vladut–Zink and Gilbert–Varshamov bound for *q* = 256

#### **8.3 Curves and Planes**

In this section, the notions of curves and planes are introduced. Definitions and discussions are restricted to two-dimensional planes, and all polynomials are assumed to have coefficients in the finite field F<sub>q</sub>. The section draws from the following sources [2, 12, 17, 18]. Let *f*(*x*, *y*) be a polynomial in the bivariate ring F<sub>q</sub>[*x*, *y*].

**Definition 8.1** (*Curve*) A curve is the set of points at which the polynomial *f*(*x*, *y*) vanishes. Mathematically, a curve *X* is associated with a polynomial *f*(*x*, *y*) so that *f*(*P*) = 0 for all *P* ∈ *X*.

A curve is a subset of a plane. There are two main types of plane: the affine plane and the projective plane. These planes can be multidimensional; however, we restrict our discussion to two-dimensional planes only.

**Definition 8.2** (*Affine Plane*) A two-dimensional affine plane denoted by A<sup>2</sup>(F*<sup>q</sup>* ) is a set of points,

$$\mathbb{A}^2(\mathbb{F}\_q) = \{ (\alpha, \beta) \, : \, \alpha, \beta \in \mathbb{F}\_q \} \tag{8.1}$$

which has cardinality *q*<sup>2</sup>.

A curve *X* is called an affine curve if *X* ⊂ A<sup>2</sup>(F<sub>q</sub>).

**Definition 8.3** (*Projective Plane*) A two-dimensional projective plane P<sup>2</sup>(F*<sup>q</sup>* ) is the algebraic closure of A<sup>2</sup> and is defined as the set of equivalence points,

$$\mathbb{P}^2(\mathbb{F}\_q) = \{ (\alpha \colon \beta \colon 1) \colon \alpha, \beta \in \mathbb{F}\_q \} \bigcup \{ (\alpha \colon 1 \colon 0) \colon \alpha \in \mathbb{F}\_q \} \bigcup \{ (1 \colon 0 \colon 0) \}.$$

A curve *X* is said to lie in the projective plane if *X* ⊂ P<sup>2</sup>(F<sub>q</sub>). The points in the projective plane are called equivalence points since, for any point *P* = (*x*<sub>0</sub> : *y*<sub>0</sub> : *z*<sub>0</sub>) ∈ P<sup>2</sup>(F<sub>q</sub>) and any α ∈ F<sub>q</sub><sup>∗</sup>,

$$\text{if } f(x\_0, y\_0, z\_0) = 0, \quad \text{then } f(\alpha x\_0, \alpha y\_0, \alpha z\_0) = 0$$

because *f*(*x*, *y*, *z*) is homogeneous. The colons in the notation of a projective point (*x* : *y* : *z*) represent this equivalence property.

The affine polynomial *f* (*x*, *y*) is in two variables, in order to define a projective polynomial in three variables, *homogenisation* is used,

$$f(\mathbf{x}, \mathbf{y}, z) = z^d f\left(\frac{\mathbf{x}}{z}, \frac{\mathbf{y}}{z}\right) \quad d = \text{Degree of } \ f(\mathbf{x}, \mathbf{y})$$

which turns *f*(*x*, *y*) into a homogeneous<sup>1</sup> polynomial in three variables. An *n*-dimensional projective polynomial has *n* + 1 variables. The affine space A<sup>2</sup>(F<sub>q</sub>) is a subset of P<sup>2</sup>(F<sub>q</sub>) and is given by,

$$\mathbb{A}^2(\mathbb{F}\_q) = \{ (\alpha : \beta : 1) \, : \, \alpha, \beta \in \mathbb{F}\_q \} \subset \mathbb{P}^2(\mathbb{F}\_q).$$

A projective curve can then be defined as a set of points,

$$\mathcal{X}' = \{ P \, : \, f(P) = 0, \,\, P \in \mathbb{P}^2(\mathbb{F}\_q) \}.$$

**Definition 8.4** (*Point at Infinity*) A point on a projective curve *X* that coincides with any of the points of P<sup>2</sup>(F*<sup>q</sup>* ) of the form,

$$\{ (\alpha : 1 : 0) : \alpha \in \mathbb{F}\_q \} \cup \{ (1 : 0 : 0) \}$$

i.e. points (*x*<sup>0</sup> : *y*<sup>0</sup> : *z*0) for which *z*<sup>0</sup> = 0 is called a point at infinity.

A third plane, called the bicyclic plane [1], is a subset of the A<sup>2</sup>(F*<sup>q</sup>* ) and consists of points,

$$\{ (\alpha, \beta) : \alpha, \beta \in \mathbb{F}\_q \setminus \{0\} \}.$$

This plane was defined so as to adapt the Fourier transform to AG codes since the inverse Fourier transform is undefined for zero coordinates.

*Example 8.1* Consider the two-dimensional affine plane A<sup>2</sup>(F4). Following the definition of A<sup>2</sup>(F4) we have,

<sup>1</sup>Each term in the polynomial has degree equal to *d*.

$$\begin{array}{llll}(0,0) & (0,1) & (1,0) & (1,1) \\ (1,\alpha) & (\alpha,1) & (1,\alpha^2) & (\alpha^2,1) \\ (\alpha^2,\alpha) & (\alpha,\alpha^2) & (0,\alpha^2) & (0,\alpha) \\ (\alpha^2,0) & (\alpha,0) & (\alpha^2,\alpha^2) & (\alpha,\alpha) \\ \end{array}$$

where α is the primitive element of the finite field F<sub>4</sub>. The two-dimensional projective plane P<sup>2</sup>(F<sub>4</sub>) consists of the 21 equivalence points of the form (*a* : *b* : 1), (*a* : 1 : 0) and (1 : 0 : 0), with *a*, *b* ∈ F<sub>4</sub>.


**Definition 8.5** (*Irreducible Curve*) A curve associated with a polynomial *f* (*x*, *y*,*z*) that cannot be reduced or factorised is called *irreducible*.

**Definition 8.6** (*Singular Point*) A point on a curve is singular if its evaluation on all partial derivatives of the defining polynomial with respect to each indeterminate is zero.

Suppose *f<sub>x</sub>*, *f<sub>y</sub>* and *f<sub>z</sub>* denote the partial derivatives of *f*(*x*, *y*, *z*) with respect to *x*, *y* and *z*, respectively. A point *P* ∈ *X* is singular if,

$$f\_x(P) = f\_y(P) = f\_z(P) = 0.$$

**Definition 8.7** (*Smooth Curve*) A curve *X* is nonsingular or smooth if it does not contain any singular points.

To obtain AG codes, it is required that the defining curve is both irreducible and smooth.

**Definition 8.8** (*Genus*) The genus of a curve can be seen as a measure of how many bends a curve has on its plane. The genus of a smooth curve defined by *f* (*x*, *y*,*z*) is given by the Plücker formula,

$$g = \frac{(d-1)(d-2)}{2}, \quad d = \text{Degree of } f(\mathbf{x}, \mathbf{y}, \mathbf{z})$$

The genus plays an important role in determining the quality of AG codes. It is desirable for curves that define AG codes to have small genera.

*Example 8.2* Consider the Hermitian curve in F<sup>4</sup> defined as,

$$\begin{aligned} f(x, y) &= x^3 + y^2 + y \quad &&\text{affine} \\ f(x, y, z) &= x^3 + y^2 z + y z^2 \quad &&\text{projective, via homogenisation} \end{aligned}$$

It is straightforward to verify that the curve is irreducible. The curve has the following projective points,

$$\begin{array}{llll}(0:0:1) & (0:1:1) & (\alpha:\alpha:1) \ (\alpha:\alpha^2:1) \\ (\alpha^2:\alpha:1) & (\alpha^2:\alpha^2:1) & (1:\alpha:1) \ (1:\alpha^2:1) \ (0:1:0) \end{array}$$

Notice the curve has a single point at infinity *P*<sup>∞</sup> = (0 : 1 : 0). One can easily check that the curve has no singular points and is thus smooth.
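The point count in Example 8.2 can be verified mechanically. In the sketch below (an assumption of this illustration: GF(4) is encoded as two-bit vectors over GF(2)[α]/(α² + α + 1), so addition is XOR and multiplication uses discrete logs), one representative of each equivalence class of P²(F₄) is tested against the Hermitian polynomial:

```python
# GF(4) = {0, 1, a, a^2} encoded as bit-vectors 0, 1, 2, 3 over
# GF(2)[a]/(a^2 + a + 1): addition is XOR, multiplication uses the
# discrete log of the order-3 multiplicative group.
EXP = [1, 2, 3]            # EXP[i] = a^i
LOG = {1: 0, 2: 1, 3: 2}   # LOG[a^i] = i

def mul(a: int, b: int) -> int:
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 3]

def hermitian(x: int, y: int, z: int) -> int:
    """Evaluate x^3 + y^2*z + y*z^2 over GF(4)."""
    return mul(x, mul(x, x)) ^ mul(mul(y, y), z) ^ mul(y, mul(z, z))

# One representative per equivalence class of P^2(GF(4)):
# (x:y:1), (x:1:0) and (1:0:0) -- 16 + 4 + 1 = 21 classes.
reps = ([(x, y, 1) for x in range(4) for y in range(4)]
        + [(x, 1, 0) for x in range(4)]
        + [(1, 0, 0)])
points = [p for p in reps if hermitian(*p) == 0]
print(len(points))  # -> 9: eight affine points plus (0:1:0)
```

The count also meets the Hasse–Weil bound of Sect. 8.3.1 with equality, 9 = 4 + 1 + 1 · ⌊2√4⌋, confirming that the Hermitian curve is maximal.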

#### *8.3.1 Important Theorems and Concepts*

The length of an AG code is at most the number of points on the defining curve. Since it is desirable to obtain codes that are as long as possible, it is useful to know the maximum number of points attainable from a curve of a given genus.

**Theorem 8.1** (Hasse–Weil with Serre's Improvement [2]) *The Hasse–Weil theorem with Serre's improvement says that the number of rational points*<sup>2</sup> *of an irreducible curve, n, with genus g in* F*<sup>q</sup> is upper bounded by,*

$$n \le q + 1 + g\lfloor 2\sqrt{q} \rfloor$$

Curves that meet this bound are called *maximal* curves. The Hermitian curves are examples of maximal curves. Bezout's theorem is an important theorem, and is used to determine the minimum distance of algebraic geometry codes. It describes the size of the set which is the intersection of two curves in the projective plane.

**Theorem 8.2** (Bezout's Theorem [2]) *Any two curves X<sup>a</sup> and X<sup>b</sup>, whose associated polynomials have degrees m and n respectively, have at most m* × *n common roots in the projective plane, counted with multiplicity.*

**Definition 8.9** (*Divisor*) A divisor on a curve *X* is a formal sum associated with the points of the curve.

$$D = \sum\_{P \in \mathcal{X}} n\_P P$$

where *n<sub>P</sub>* are integers.

<sup>2</sup>A rational point is a point of degree one. See Sect. 8.4 for the definition of the degree of a point on a curve.

A zero divisor is one that has *n<sub>P</sub>* = 0 for all *P* ∈ *X*. A divisor is called effective if *n<sub>P</sub>* ≥ 0 for all *P* ∈ *X*. The support of a divisor is the subset of *X* for which *n<sub>P</sub>* ≠ 0. The degree of a divisor is given as,

$$\deg(D) = \sum\_{P \in \mathcal{X}} n\_P \deg(P).$$

For simplicity, it is assumed that the degree of points *P* ∈ *X*, i.e. *deg*(*P*), is 1 (points of higher degree are discussed in Sect. 8.4). Addition of two divisors *D*<sub>1</sub> = Σ<sub>*P*∈*X*</sub> *n<sub>P</sub>P* and *D*<sub>2</sub> = Σ<sub>*P*∈*X*</sub> *n′<sub>P</sub>P* is defined as,

$$D\_1 + D\_2 = \sum\_{P \in \mathcal{X}} (n\_P + n'\_P) P.$$

Divisors are simply book-keeping structures that store information on points of a curve. Below is an example of the intersection divisor of two curves.

*Example 8.3* Consider the Hermitian curve in F<sup>4</sup> defined as,

$$f\_1(\mathbf{x}, \mathbf{y}, z) = \mathbf{x}^3 + \mathbf{y}^2 z + \mathbf{y}z^2$$

with points given in Example 8.2 and the curve defined by

$$f\_2(x, y, z) = x$$

with points

$$(0:0:1)\ (0:1:1)\ (0:\alpha:1)\ (0:\alpha^2:1)\ (0:1:0)$$

These two curves intersect at the 3 points below, each with multiplicity 1,

$$(0:0:1)\ (0:1:0)\ (0:1:1).$$

Alternatively, this may be represented using a divisor *D*,

$$D = (0:0:1) + (0:1:0) + (0:1:1)$$

with *n<sub>P</sub>*, the multiplicity, equal to 1 for all the points. Notice that the two curves meet at exactly *deg*(*f*<sub>1</sub>)*deg*(*f*<sub>2</sub>) = 3 points, in agreement with Bezout's theorem.
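Example 8.3's intersection can be checked in the same spirit. A sketch (an assumption of this illustration: GF(4) encoded as two-bit vectors with a hand-rolled `mul`; `f1` and `f2` are just labels for the two polynomials) that intersects the two zero sets over representatives of P²(F₄):

```python
# GF(4) as bit-vectors over GF(2)[a]/(a^2 + a + 1): add = XOR,
# multiply via discrete logs of the order-3 multiplicative group.
EXP, LOG = [1, 2, 3], {1: 0, 2: 1, 3: 2}
def mul(a, b):
    return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 3]

def f1(x, y, z):  # Hermitian curve x^3 + y^2*z + y*z^2
    return mul(x, mul(x, x)) ^ mul(mul(y, y), z) ^ mul(y, mul(z, z))

def f2(x, y, z):  # the line x = 0
    return x

reps = ([(x, y, 1) for x in range(4) for y in range(4)]
        + [(x, 1, 0) for x in range(4)] + [(1, 0, 0)])
common = [p for p in reps if f1(*p) == 0 and f2(*p) == 0]
# Bezout: at most deg(f1) * deg(f2) = 3 * 1 = 3 common points.
print(sorted(common))  # -> [(0, 0, 1), (0, 1, 0), (0, 1, 1)]
```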

For rational functions with denominators, points in divisor with *np* < 0 are poles. For example, *D* = *P*<sup>1</sup> − 2*P*<sup>2</sup> will denote an intersection divisor of two curves that have one zero *P*<sup>1</sup> and pole *P*<sup>2</sup> with multiplicity two in common. Below is the formal definition of the field of fractions of a curve *X* .

**Definition 8.10** (*Field of fractions*) The field of fractions F*<sup>q</sup>* (*X* ) of a curve *X* defined by a polynomial *f* (*x*, *y*,*z*) contains all rational functions of the form

$$\frac{\mathbf{g}(\mathbf{x},\mathbf{y},z)}{h(\mathbf{x},\mathbf{y},z)}$$

with the restriction that *g*(*x*, *y*,*z*) and *h*(*x*, *y*,*z*) are homogeneous polynomials that have the same degree and are not divisible by *f* (*x*, *y*,*z*).

A subset (Riemann–Roch space) of the field of fractions of *X* meeting certain conditions are evaluated at points of the curve *X* to form codewords of an AG code. Thus, there is a one-to-one mapping between rational functions in this subset and codewords of an AG code. The Riemann–Roch theorem defines this subset and gives a lower bound on the dimension of AG codes. The definition of a Riemann–Roch space is given.

**Definition 8.11** (*Riemann–Roch Space*) The Riemann–Roch space associated with a divisor *D* is given by,

$$L(D) = \{ t \in \mathbb{F}\_q(\mathcal{X}) \,|\, (t) + D \ge 0 \} \cup \{0\}$$

where F<sub>q</sub>(*X*) is the field of fractions and (*t*) is the intersection divisor<sup>3</sup> of the rational function *t* and the curve *X*.

Essentially, the Riemann–Roch space associated with a divisor *D* is a set of functions of the form *t* from the field of fractions F*<sup>q</sup>* (*X* ) such that the divisor sum (*t*) + *D* has no poles, i.e. (*t*) + *D* ≥ 0.

The rational functions in *L*(*D*) are functions from the field of fractions F<sub>q</sub>(*X*) that must have poles only at the zeros (positive terms) contained in the divisor *D*, each pole occurring with at most the multiplicity defined in *D*, and must have zeros at the poles (negative terms) contained in *D*, each zero occurring with at least the multiplicity defined in *D*.

*Example 8.4* Suppose a hypothetical curve *X* has points of degree one,

$$\mathcal{X} = \{P\_1, P\_2, P\_3, P\_4\}$$

We choose a divisor *D* = 2*P*<sup>1</sup> − 5*P*<sup>2</sup> with degree −3, and define a Riemann–Roch space *L*(*D*). If we randomly select three functions *t*1, *t*2, and *t*<sup>3</sup> from the field of fractions F*<sup>q</sup>* (*X* ) such that they have divisors,

$$(t\_1) = -3P\_1 + 5P\_2 + 4P\_4 \qquad (t\_2) = 2P\_1 + 4P\_2 \qquad (t\_3) = -P\_1 + 8P\_2 + P\_3$$

*t*<sub>1</sub> ∉ *L*(*D*) since (*t*<sub>1</sub>) + *D* = −*P*<sub>1</sub> + 4*P*<sub>4</sub> contains negative terms or poles. Also, *t*<sub>2</sub> ∉ *L*(*D*) since (*t*<sub>2</sub>) + *D* = 4*P*<sub>1</sub> − *P*<sub>2</sub> contains negative terms. However, *t*<sub>3</sub> ∈ *L*(*D*) since (*t*<sub>3</sub>) + *D* = *P*<sub>1</sub> + 3*P*<sub>2</sub> + *P*<sub>3</sub> contains no negative terms. Any function *t* ∈ F<sub>q</sub>(*X*) is also in *L*(*D*) if it has a pole at *P*<sub>1</sub> with multiplicity at most 2 (with no other poles in common with *X*) and a zero at *P*<sub>2</sub> with multiplicity at least 5.

<sup>3</sup>An intersection divisor is a divisor that contains information on the points of intersection of two curves.
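The membership tests of Example 8.4 are easy to mechanise. In this sketch divisors are dictionaries from point labels to integer coefficients (the helper names `add_div` and `effective` are mine, and (*t*<sub>1</sub>) is taken as −3*P*<sub>1</sub> + 5*P*<sub>2</sub> + 4*P*<sub>4</sub>, which is consistent with (*t*<sub>1</sub>) + *D* = −*P*<sub>1</sub> + 4*P*<sub>4</sub> as stated in the example):

```python
def add_div(d1: dict, d2: dict) -> dict:
    """Pointwise sum of two divisors, dropping zero coefficients."""
    total = {p: d1.get(p, 0) + d2.get(p, 0) for p in set(d1) | set(d2)}
    return {p: n for p, n in total.items() if n != 0}

def effective(d: dict) -> bool:
    """(t) + D >= 0 holds when no coefficient is negative (no poles)."""
    return all(n >= 0 for n in d.values())

D = {"P1": 2, "P2": -5}              # D = 2P1 - 5P2, degree -3
t1 = {"P1": -3, "P2": 5, "P4": 4}    # (t1) = -3P1 + 5P2 + 4P4
t2 = {"P1": 2, "P2": 4}              # (t2) = 2P1 + 4P2
t3 = {"P1": -1, "P2": 8, "P3": 1}    # (t3) = -P1 + 8P2 + P3

for name, t in [("t1", t1), ("t2", t2), ("t3", t3)]:
    print(name, "in L(D):", effective(add_div(t, D)))
# -> t1 False, t2 False, t3 True, matching the example
```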

The Riemann–Roch space is a vector space (with rational functions as elements) and thus has a set of basis functions. The size of this set is the dimension of the space.

**Theorem 8.3** (Riemann–Roch Theorem [2]) *Let X be a curve with genus g and D any divisor with degree*(*D*) > 2*g* − 2*; then the dimension of the Riemann–Roch space associated with D, denoted by l*(*D*)*, is,*

$$l(D) = degree(D) - g + 1$$

Algebraic geometry codes are the image of an evaluation map of a Riemann–Roch space associated with a divisor *D* so that

$$L(D) \to \mathbb{F}\_q^n$$

$$t \to (t(P\_1), t(P\_2), \dots, t(P\_n))$$

where *X* = {*P*<sub>1</sub>, *P*<sub>2</sub>,..., *P<sub>n</sub>*, *P*<sub>∞</sub>} is a smooth irreducible projective curve of genus *g* defined over F<sub>q</sub>. The divisor *D* must have no points in common with a divisor *T* associated with *X*, i.e. it has support disjoint from *T*. For example, if the divisor *T* is of the form

$$T = P\_1 + P\_2 + \dots + P\_n$$

then, *D* = *m P*<sub>∞</sub>.

Codes defined by the divisors *T* and *D* = *m P*<sub>∞</sub> are called one-point AG codes (since the divisor *D* has a support containing only one point), and AG codes are predominantly defined in this way since the parameters of such codes are easily determined [10].

#### *8.3.2 Construction of AG Codes*

The following steps are necessary in order to construct a generator matrix of an AG code:

1. Determine the points of the curve and its genus *g* (via the Plücker formula).
2. Choose a divisor *D* with support disjoint from the evaluation points, and obtain the code dimension *k* = *l*(*D*) from the Riemann–Roch theorem.
3. Find a set of basis functions {*t*<sub>1</sub>,..., *t<sub>k</sub>*} for the Riemann–Roch space *L*(*D*).
4. Evaluate the basis functions at the *n* evaluation points to form the rows of the generator matrix.


*Example 8.5* Consider again the Hermitian curve defined in F<sup>4</sup> as,

$$f(\mathbf{x}, \mathbf{y}, z) = x^3 + \mathbf{y}^2 z + \mathbf{y}z^2$$

1. In Example 8.2 this curve was shown to have 8 affine points and one point at infinity. The genus of this curve is given by the Plücker formula,

$$g = \frac{(r-1)(r-2)}{2} = 1$$

where *r* = 3 is the degree of *f* (*x*, *y*,*z*).

2. Let *D* = 5*P*<sup>∞</sup> where *P*<sup>∞</sup> = (0: 1: 0) and *T* be the sum of all 8 affine points. The dimension of the Riemann–Roch space is then given by,

$$l(5P\_{\infty}) = 5 - 1 + 1 = 5$$

thus, the AG code has dimension *k* = 5.

3. The basis functions for the space *L*(5*P*∞) are

$$\{t\_1, \dots, t\_k\} = \left\{1, \frac{\mathbf{x}}{z}, \frac{\mathbf{x}^2}{z^2}, \frac{\mathbf{y}}{z}, \frac{\mathbf{xy}}{z^2}\right\}$$

By examining the basis, it is clear that *t*<sup>1</sup> = 1 has no poles, thus, (*t*1) + *D* has no poles also. Basis functions with denominator *z* have (*ti*) = *S* − *P*∞, where *S* is a divisor of the numerator. Thus, (*ti*) + *D* has no poles. Basis functions with denominator *z*<sup>2</sup> have (*tj*) = *S* − 2*P*∞, where *S* is a divisor of the numerator. Thus, (*tj*) + *D* also has no poles.

4. The generator matrix of the Hermitian code defined with divisor *D* = 5*P*<sup>∞</sup> is thus,

$$G = \begin{bmatrix} t\_1(P\_1) & \cdots & t\_1(P\_n) \\ \vdots & \ddots & \vdots \\ t\_k(P\_1) & \cdots & t\_k(P\_n) \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & \alpha^2 & \alpha^2 & 1 \\ 0 & 1 & 0 & 0 & 0 & \alpha^2 & \alpha & 0 \\ 0 & 0 & 1 & 0 & 0 & \alpha & 1 & \alpha \\ 0 & 0 & 0 & 1 & 0 & \alpha & 0 & \alpha^2 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix}$$
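Steps 1–4 can be reproduced programmatically. The sketch below (assumptions: GF(4) encoded as two-bit vectors with log-table multiplication; the basis restricted to the affine patch *z* = 1) builds the 5 × 8 evaluation matrix, which is row-equivalent to the reduced matrix above up to the ordering of points, and exhaustively verifies the designed distance *d* ≥ *n* − *degree*(*D*) = 3:

```python
from itertools import product

# GF(4) as bit-vectors over GF(2)[a]/(a^2 + a + 1): add = XOR,
# multiply via discrete logs of the order-3 multiplicative group.
EXP, LOG = [1, 2, 3], {1: 0, 2: 1, 3: 2}
def mul(a, b):
    return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 3]

# The 8 affine points of x^3 + y^2 + y = 0 over GF(4).
pts = [(x, y) for x in range(4) for y in range(4)
       if mul(x, mul(x, x)) ^ mul(y, y) ^ y == 0]

# Basis of L(5*P_inf) on the patch z = 1: {1, x, x^2, y, x*y}.
basis = [lambda x, y: 1,
         lambda x, y: x,
         lambda x, y: mul(x, x),
         lambda x, y: y,
         lambda x, y: mul(x, y)]
G = [[t(x, y) for (x, y) in pts] for t in basis]

# Weight of every nonzero codeword m*G; theory says >= 8 - 5 = 3.
weights = []
for m in product(range(4), repeat=5):
    if any(m):
        c = [0] * len(pts)
        for mi, row in zip(m, G):
            c = [ci ^ mul(mi, gi) for ci, gi in zip(c, row)]
        weights.append(sum(1 for ci in c if ci != 0))
print(len(pts), min(weights))
```

The exhaustive search over all 4⁵ − 1 nonzero messages confirms that no codeword has weight below the designed distance.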

*Example 8.6* Consider the curve defined in F<sup>8</sup> as,

$$f(\mathbf{x}, \mathbf{y}, z) = \mathbf{x}$$

1. This curve is a straight line and has 8 affine points of the form (0 : β : 1) and one point at infinity (0 : 1 : 0). The curve is both irreducible and smooth. The genus of this curve is given by the Plücker formula,

$$g = \frac{(r-1)(r-2)}{2} = 0$$

where *r* = 1 is the degree of *f*(*x*, *y*, *z*). Clearly, the genus is zero since the curve is a straight line and has no bends.

2. Let *D* = 5*P*∞, where *P*<sup>∞</sup> = (0: 1: 0) and *T* be the sum of all 8 affine points. The dimension of the Riemann–Roch space is then given by,

$$l(5P\_{\infty}) = 5 - 0 + 1 = 6$$

thus, the AG code has dimension *k* = 6.

3. The basis functions for the space *L*(5*P*∞) are

$$\{t\_1, \dots, t\_k\} = \left\{1, \frac{\mathbf{y}}{z}, \frac{\mathbf{y}^2}{z^2}, \frac{\mathbf{y}^3}{z^3}, \frac{\mathbf{y}^4}{z^4}, \frac{\mathbf{y}^5}{z^5}\right\}$$

By examining the basis, it is clear that *t*<sup>1</sup> = 1 has no poles, thus, (*t*1) + *D* has no poles also. Basis functions with denominator *z* have (*t*1) = *S* − *P*<sup>∞</sup> where *S* = (0 : 0 : 1) is a divisor of the numerator. The denominator polynomial *z* evaluates to zero at the point at infinity *P*<sup>∞</sup> of the divisor *D*, thus, (*t*1) + *D* has no poles. Basis functions with denominator *z*<sup>2</sup> have (*t*2) = *S* − 2*P*<sup>∞</sup> where *S* = 2 × (0 : 0 : 1) is a divisor of the numerator. The denominator polynomial *z*<sup>2</sup> evaluates to zero at the point at infinity *P*<sup>∞</sup> of the divisor *D* with multiplicity 2, thus, (*t*2) + *D* has no poles. Basis functions with denominator *z*<sup>3</sup> have (*t*3) = *S* − 3*P*<sup>∞</sup> where *S* = 3 × (0 : 0 : 1) is a divisor of the numerator. Thus, (*t*3) + *D* also has no poles. And so on.

4. The generator matrix of the code defined with divisor *D* = 5*P*<sup>∞</sup> is thus,

$$G = \begin{bmatrix} t\_1(P\_1) & \cdots & t\_1(P\_n) \\ \vdots & \ddots & \vdots \\ t\_k(P\_1) & \cdots & t\_k(P\_n) \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & \alpha & \alpha^2 & \alpha^3 & \alpha^4 & \alpha^5 & \alpha^6 & 1 \\ 0 & \alpha^2 & \alpha^4 & \alpha^6 & \alpha & \alpha^3 & \alpha^5 & 1 \\ 0 & \alpha^3 & \alpha^6 & \alpha^2 & \alpha^5 & \alpha & \alpha^4 & 1 \\ 0 & \alpha^4 & \alpha & \alpha^5 & \alpha^2 & \alpha^6 & \alpha^3 & 1 \\ 0 & \alpha^5 & \alpha^3 & \alpha & \alpha^6 & \alpha^4 & \alpha^2 & 1 \end{bmatrix}$$

Clearly, this is a generator matrix of an extended Reed–Solomon code with parameters [8, 6, 3]<sub>8</sub>.

**Theorem 8.4** (From [2]) *The minimum distance of an AG code is given by,*

$$d \ge n - degree(D)$$

Thus, the Hermitian code defined by *D* = 5*P*<sub>∞</sub> is an [8, 5, 3]<sub>4</sub> code. The dual of an AG code has parameters [17],

$$\text{Dimension, } k^{\perp} = n - degree(D) + g - 1$$

$$\text{Distance, } d^{\perp} \ge degree(D) - 2\text{g} + 2$$

#### **8.4 Generalised AG Codes**

Algebraic geometry codes and codes obtained from them feature prominently in the databases of best-known codes [8, 15] for an appreciable range of code lengths for different field sizes *q*. Generalised algebraic geometry codes were first presented by Niederreiter et al. [21] and Xing et al. [13]. A subsequent paper by Ozbudak and Stichtenoth [14] shed more light on the construction. AG codes as defined by Goppa utilised places of degree one, or rational places. Generalised AG codes, however, were constructed by Xing et al. using places of higher degree (including places of degree one). In [20], the authors presented a method of constructing generalised AG codes which uses a concatenation concept. The paper showed that best-known codes were obtainable via this construction. In [4] it was shown that the method can be effective in constructing new codes, and the authors presented 59 codes in the finite fields F<sub>4</sub>, F<sub>8</sub> and F<sub>9</sub> better than the codes in [8]. In [11], the authors presented a construction method based on [20] that uses a subfield image concept and obtained new binary codes as a result. In [19] the authors presented some new curves as well as 129 new codes in F<sub>8</sub> and F<sub>9</sub>.

#### *8.4.1 Concept of Places of Higher Degree*

Recall that a two-dimensional affine space A<sup>2</sup>(F<sub>q</sub>) is given by the set of points

$$\{ (\alpha, \beta) : \alpha, \beta \in \mathbb{F}\_q \}$$

while its projective closure P<sup>2</sup>(F*<sup>q</sup>* ) is given by the set of equivalence points

$$\{ (\alpha : \beta : 1) : \alpha, \beta \in \mathbb{F}\_q \} \cup \{ (\alpha : 1 : 0) : \alpha \in \mathbb{F}\_q \} \cup \{ (1 : 0 : 0) \}.$$

Given a homogeneous polynomial *F*(*x*, *y*,*z*), a curve *X* /F*<sup>q</sup>* defined in P<sup>2</sup>(F*<sup>q</sup>* ) is a set of distinct points

$$\mathcal{X}'/\mathbb{F}\_q = \{ T \in \mathbb{P}^2(\mathbb{F}\_q) \, : \, F(T) = 0 \}$$

Let F<sub>q<sup>ℓ</sup></sub> be an extension of the field F<sub>q</sub>; the Frobenius automorphism is given as

$$\begin{aligned} \phi\_{q,\ell} &: \mathbb{F}\_{q^{\ell}} \to \mathbb{F}\_{q^{\ell}} \\ \phi\_{q,\ell}(\beta) &= \beta^{q} \qquad \beta \in \mathbb{F}\_{q^{\ell}} \end{aligned}$$

and its action on a projective point (*x* : *y* : *z*) in P<sup>2</sup>(F<sub>q<sup>ℓ</sup></sub>) is

$$\phi\_{q,\ell}((x : y : z)) = (x^q : y^q : z^q).$$

**Definition 8.12** (*Place of Degree ℓ*, from [18]) A place of degree ℓ is a set of ℓ points of a curve defined in the extension field F<sub>q<sup>ℓ</sup></sub>, denoted by {*T*<sub>0</sub>, *T*<sub>1</sub>,..., *T*<sub>ℓ−1</sub>}, where each *T<sub>i</sub>* = φ<sup>*i*</sup><sub>q,ℓ</sub>(*T*<sub>0</sub>). Places of degree one are called rational places.

*Example 8.7* Consider the curve in F<sup>4</sup> defined as,

$$F(\mathbf{x}, \mathbf{y}, z) = \mathbf{x}$$

The curve has the following projective rational points (points of degree 1),

 $(0:0:1)$   $(0:1:1)$   $(0:\alpha:1)$   $(0:\alpha^2:1)$   $(0:1:0)$ 

where α is the primitive element of F<sub>4</sub>. The curve has the following places of degree 2,

$$\begin{array}{ll} \{ (0:\beta:1), (0:\beta^4:1) \} & \{ (0:\beta^2:1), (0:\beta^8:1) \} \\ \{ (0:\beta^3:1), (0:\beta^{12}:1) \} & \{ (0:\beta^6:1), (0:\beta^9:1) \} \\ \{ (0:\beta^7:1), (0:\beta^{13}:1) \} & \{ (0:\beta^{11}:1), (0:\beta^{14}:1) \} \end{array}$$

where β is a primitive element of F<sub>16</sub>.
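The orbits in this example can be enumerated mechanically. The sketch below (function and variable names are ours) represents each point (0 : β<sup>i</sup> : 1) by the exponent *i* and collects the orbits of the Frobenius action *i* → 4*i* mod 15; size-one orbits are points whose *y*-coordinate already lies in F<sub>4</sub>, and size-two orbits are the places of degree 2. The remaining rational points (0 : 0 : 1) and (0 : 1 : 0) are fixed by the Frobenius map and are not covered by this exponent enumeration.

```python
# Enumerate Frobenius orbits for Example 8.7. Points (0 : beta^i : 1), with
# beta a primitive element of F16, are represented by the exponent i; the
# Frobenius map over F4, beta -> beta^4, acts on exponents as i -> 4i mod 15.

def frobenius_orbit(i, q=4, order=15):
    """Orbit of exponent i under i -> q*i (mod order)."""
    orbit, j = [], i
    while True:
        orbit.append(j)
        j = (q * j) % order
        if j == i:
            return orbit

seen, rational, degree2 = set(), [], []
for i in range(15):
    if i in seen:
        continue
    orbit = frobenius_orbit(i)
    seen.update(orbit)
    if len(orbit) == 1:        # beta^i is fixed, so beta^i lies in F4
        rational.append(orbit)
    else:                      # an orbit {T0, T1} is a place of degree 2
        degree2.append(sorted(orbit))

print(rational)   # [[0], [5], [10]] -- the F4 elements 1, beta^5, beta^10
print(degree2)    # [[1, 4], [2, 8], [3, 12], [6, 9], [7, 13], [11, 14]]
```

The six size-two orbits agree with the degree-2 places listed above.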

#### *8.4.2 Generalised Construction*

This section gives details of the construction of generalised AG codes as described in [21]. Two maps that are useful in the construction are now described. Observe that F<sub>q</sub> is a subfield of F<sub>q<sup>ℓ</sup></sub> for all ℓ ≥ 2. It is therefore possible to map F<sub>q<sup>ℓ</sup></sub> to an ℓ-dimensional vector space with elements from F<sub>q</sub> using a suitable basis. The map π<sub>ℓ</sub> is defined as

$$\begin{aligned} \pi\_{\ell} &: \mathbb{F}\_{q^{\ell}} \to \mathbb{F}\_{q}^{\ell} \\ \pi\_{\ell}(\beta) &= (c\_{1}c\_{2}\dots c\_{\ell}) \qquad \beta \in \mathbb{F}\_{q^{\ell}}, \ c\_{i} \in \mathbb{F}\_{q}. \end{aligned}$$

Suppose (γ<sub>1</sub>, γ<sub>2</sub>, ..., γ<sub>ℓ</sub>) forms a suitable basis of the vector space F<sup>ℓ</sup><sub>q</sub>; then β = *c*<sub>1</sub>γ<sub>1</sub> + *c*<sub>2</sub>γ<sub>2</sub> + ··· + *c*<sub>ℓ</sub>γ<sub>ℓ</sub>. Finally, the map σ<sub>ℓ,n</sub> represents an encoding from an ℓ-dimensional message space over F<sub>q</sub> to an *n*-dimensional code space,

$$
\sigma\_{\ell,n} : \mathbb{F}\_q^\ell \to \mathbb{F}\_q^n
$$

with ℓ ≤ *n*.

A description of generalised AG codes as presented in [4, 13, 21] now follows. Let *F* = *F*(*x*, *y*, *z*) be a homogeneous polynomial defined over F<sub>q</sub>. Let *g* be the genus of a smooth irreducible curve *X*/F<sub>q</sub> corresponding to the polynomial *F*. Also, let *P*<sub>1</sub>, *P*<sub>2</sub>, ..., *P<sub>r</sub>* be *r* distinct places of *X*/F<sub>q</sub> and *k<sub>i</sub>* = deg(*P<sub>i</sub>*), where deg denotes the degree. *W* is a divisor of the curve *X*/F<sub>q</sub> such that

$$W = P\_1 + P\_2 + \dots + P\_r$$

and *G* is another divisor such that the two do not intersect.<sup>4</sup> Specifically, the divisor *G* = *m*(*Q* − *R*), where deg(*Q*) = deg(*R*) + 1 for arbitrary<sup>5</sup> divisors *Q* and *R*. As mentioned earlier, associated with the divisor *G* is a Riemann–Roch space *L*(*G*), with *m* = deg(*G*) an integer, *m* ≥ 0. From the Riemann–Roch theorem (Theorem 8.3), it is known that the dimension of *L*(*G*) is given by *l*(*G*) and

$$l(G) \ge m - g + 1.$$

Also, associated with each *P<sub>i</sub>* is a *q*-ary code *C<sub>i</sub>* with parameters [*n<sub>i</sub>*, *k<sub>i</sub>* = deg(*P<sub>i</sub>*), *d<sub>i</sub>*]<sub>q</sub>, with the restriction that *d<sub>i</sub>* ≤ *k<sub>i</sub>*. Let

$$\{f\_1, f\_2, \ldots, f\_k : f\_l \in \mathcal{L}(G)\}$$

denote a set of *k* linearly independent elements of *L*(*G*) that form a basis. A generator matrix for a generalised AG code is then given by

$$M = \begin{bmatrix} \sigma\_{k\_1, n\_1}(\pi\_{k\_1}(f\_1(P\_1))) & \cdots & \sigma\_{k\_r, n\_r}(\pi\_{k\_r}(f\_1(P\_r))) \\ \sigma\_{k\_1, n\_1}(\pi\_{k\_1}(f\_2(P\_1))) & \cdots & \sigma\_{k\_r, n\_r}(\pi\_{k\_r}(f\_2(P\_r))) \\ \vdots & & \vdots \\ \sigma\_{k\_1, n\_1}(\pi\_{k\_1}(f\_k(P\_1))) & \cdots & \sigma\_{k\_r, n\_r}(\pi\_{k\_r}(f\_k(P\_r))) \end{bmatrix}$$

<sup>4</sup>This is consistent with the definition of AG codes. The two divisors should have no points in common.

<sup>5</sup>These are randomly chosen places such that the difference between their degrees is 1 and *G* does not intersect *W*.

where *f<sub>l</sub>*(*P<sub>i</sub>*) is the evaluation of the basis element *f<sub>l</sub>* at the place *P<sub>i</sub>*, π<sub>k<sub>i</sub></sub> is a mapping from F<sub>q<sup>k<sub>i</sub></sup></sub> to F<sup>k<sub>i</sub></sup><sub>q</sub>, and σ<sub>k<sub>i</sub>,n<sub>i</sub></sub> is the encoding of a message vector in F<sup>k<sub>i</sub></sup><sub>q</sub> to a code vector in F<sup>n<sub>i</sub></sup><sub>q</sub>. This is a three-step process. The basis element *f<sub>l</sub>* is first evaluated at the place *P<sub>i</sub>*, resulting in an element of F<sub>q<sup>k<sub>i</sub></sup></sub>. The result is then mapped to a vector of length *k<sub>i</sub>* over the subfield F<sub>q</sub>. Finally, this vector is encoded with a code with parameters [*n<sub>i</sub>*, *k<sub>i</sub>*, *d<sub>i</sub>*]<sub>q</sub>.

It is desirable to choose each code *C<sub>i</sub>* with the maximum possible minimum distance, so that *d<sub>i</sub>* = *k<sub>i</sub>* [21]. The same code is used in the map σ<sub>k<sub>i</sub>,n<sub>i</sub></sub> for all places of the same degree, i.e. a code *C<sub>j</sub>* with parameters [*n<sub>j</sub>*, *j*, *d<sub>j</sub>*]<sub>q</sub> is used for every place of degree *j*. Let *A<sub>j</sub>* be an integer denoting the number of places of degree *j* and *B<sub>j</sub>* an integer such that 0 ≤ *B<sub>j</sub>* ≤ *A<sub>j</sub>*.

If *t* is the maximum degree of any place *P<sub>i</sub>* chosen in the construction, then the generalised AG code is represented as

$$C\_1(k; t; B\_1, B\_2, \dots, B\_t; d\_1, d\_2, \dots, d\_t).$$

Let [*n*, *k*, *d*]<sub>q</sub> represent a linear code over F<sub>q</sub> with length *n*, dimension *k* and minimum distance *d*; then a generalised AG code has the parameters [21]

$$\begin{aligned} k &= l(G) \ge m - g + 1\\ n &= \sum\_{i=1}^r n\_i = \sum\_{j=1}^t B\_j n\_j\\ d &\ge \sum\_{i=1}^r d\_i - g - k + 1 = \sum\_{j=1}^t B\_j d\_j - g - k + 1. \end{aligned}$$

Below are two examples showing the construction of generalised AG codes.

*Example 8.8* Let *F*(*x*, *y*, *z*) = *x*<sup>3</sup> + *xyz* + *xz*<sup>2</sup> + *y*<sup>2</sup>*z* [21] be a polynomial with coefficients in F<sub>2</sub>. The curve *X*/F<sub>2</sub> has genus *g* = 1, *A*<sub>1</sub> = 4 places of degree 1 and *A*<sub>2</sub> = 2 places of degree 2.

Table 8.4 gives the places of *X*/F<sub>2</sub> up to degree 2. The field F<sub>2<sup>2</sup></sub> is defined by the primitive polynomial *s*<sup>2</sup> + *s* + 1, with α as its primitive element. The points

$$\mathcal{R} = (1:a^3 + a^2:1)$$

as a place of degree 4 and

$$\mathcal{Q} = (1:b^4 + b^3 + b^2:1)$$

as a place of degree 5 are also chosen arbitrarily, where *a* and *b* are primitive elements of F<sub>2<sup>4</sup></sub> (defined by the polynomial *s*<sup>4</sup> + *s*<sup>3</sup> + *s*<sup>2</sup> + *s* + 1) and F<sub>2<sup>5</sup></sub> (defined by the polynomial *s*<sup>5</sup> + *s*<sup>2</sup> + 1), respectively. The divisor *W* is

$$W = P\_1 + \dots + P\_6.$$


The basis of the Riemann–Roch space *L*(2*D*), with *D* = *Q* − *R* and *m* = 2, is obtained with the computer algebra software MAGMA [3] as

$$\begin{aligned} f\_1 &= (\mathbf{x}^\top + \mathbf{x}^3 + \mathbf{x}) / (\mathbf{x}^{10} + \mathbf{x}^4 + 1) \mathbf{y} \\ &+ (\mathbf{x}^{10} + \mathbf{x}^9 + \mathbf{x}^7 + \mathbf{x}^6 + \mathbf{x}^5 + \mathbf{x} + 1) / (\mathbf{x}^{10} + \mathbf{x}^4 + 1) \\ f\_2 &= (\mathbf{x}^8 + \mathbf{x}^7 + \mathbf{x}^4 + \mathbf{x}^3 + \mathbf{x} + 1) / (\mathbf{x}^{10} + \mathbf{x}^4 + 1) \mathbf{y} \\ &+ (\mathbf{x}^8 + \mathbf{x}^4 + \mathbf{x}^2) / (\mathbf{x}^{10} + \mathbf{x}^4 + 1) \end{aligned}$$

For the map σ<sub>k<sub>i</sub>,n<sub>i</sub></sub>, two codes are used: *C*<sub>1</sub>, a [1, 1, 1]<sub>2</sub> cyclic code, for places of degree 1, and *C*<sub>2</sub>, a [3, 2, 2]<sub>2</sub> cyclic code, for places of degree 2. For the map π<sub>2</sub>, which applies to places of degree 2, a polynomial basis [γ<sub>1</sub>, γ<sub>2</sub>] = [1, α] is used. Only the first point of each place *P<sub>i</sub>* with deg(*P<sub>i</sub>*) = 2 is utilised in the evaluation of *f*<sub>1</sub> and *f*<sub>2</sub>. The generator matrix *M* of the resulting [10, 2, 6]<sub>2</sub> generalised AG code over F<sub>2</sub> is

$$M = \begin{bmatrix} 1 & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 & 1 & 0 & 1 & 0 & 1 \end{bmatrix}$$
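The parameters of this [10, 2, 6]<sub>2</sub> code agree with the formulas of the previous section; a minimal numeric check (the helper name is ours):

```python
# Check the generalised AG code parameters of Example 8.8: genus g = 1,
# m = deg(G) = 2, B_1 = 4 places of degree 1 encoded with the [1,1,1]_2 code
# and B_2 = 2 places of degree 2 encoded with the [3,2,2]_2 code.

def gag_parameters(g, m, blocks):
    """blocks: list of (B_j, n_j, d_j). Returns (n, k, d) from the bounds."""
    n = sum(B * nj for B, nj, _ in blocks)
    k = m - g + 1                                   # lower bound on k = l(G)
    d = sum(B * dj for B, _, dj in blocks) - g - k + 1
    return n, k, d

print(gag_parameters(g=1, m=2, blocks=[(4, 1, 1), (2, 3, 2)]))  # (10, 2, 6)
```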

*Example 8.9* Consider again the polynomial

$$F(\mathbf{x}, \mathbf{y}, z) = \mathbf{x}^3 + \mathbf{x}\mathbf{y}z + \mathbf{x}z^2 + \mathbf{y}^2 z$$

with coefficients from F<sub>2</sub>, whose curve (with genus equal to 1) has places up to degree 2 as in Table 8.4. An element *f* of the Riemann–Roch space defined by the divisor *G* = (*R* − *Q*) with

$$\mathcal{Q} = (a:a^3 + a^2:1)$$

and

$$R = (b:b^4+b^3+b^2+b+1:1)$$

where *a* and *b* are primitive elements of F<sub>2<sup>4</sup></sub> and F<sub>2<sup>5</sup></sub>, respectively (since the curve has no place of degree 3), is given by

$$\begin{aligned} f &= (\mathbf{x}^4 + \mathbf{x}^2 z^2 + z^4) \mathbf{y} / (\mathbf{x}^5 + \mathbf{x}^3 z^2 + z^5) \\ &\quad + (\mathbf{x}^5 + \mathbf{x}^4 z + \mathbf{x}^3 z^2 + \mathbf{x}^2 z^3 + \mathbf{x} z^4 + z^5) / (\mathbf{x}^5 + \mathbf{x}^3 z^2 + z^5) \end{aligned}$$

Evaluating *f* at all six places *P<sub>i</sub>* from Table 8.4 results in

$$f(P\_i)\big|\_{\deg(P\_i)=1} = (1, 1, 0, 1), \qquad f(P\_i)\big|\_{\deg(P\_i)=2} = (1, \alpha^2).$$

This forms the code [6, 1, 5]<sub>4</sub>.<sup>6</sup> Applying the map π<sub>deg(P<sub>i</sub>)</sub>, which maps each evaluation to a vector over F<sub>2</sub>, this becomes

$$[\, 1 \mid 1 \mid 0 \mid 1 \mid \underbrace{1 \;\; 0}\_{1} \mid \underbrace{1 \;\; 1}\_{\alpha^2} \,]$$

which forms the code [8, 1, 5]<sub>2</sub>. Short auxiliary codes, [1, 1, 1]<sub>2</sub> to encode *f*(*P<sub>i</sub>*) with deg(*P<sub>i</sub>*) = 1 and [3, 2, 2]<sub>2</sub> to encode *f*(*P<sub>i</sub>*) with deg(*P<sub>i</sub>*) = 2, are used. The resulting codeword of the generalised AG code is

$$[\, 1 \mid 1 \mid 0 \mid 1 \mid 1 \;\; 0 \;\; 1 \mid 1 \;\; 1 \;\; 0 \,]$$

This forms the code [10, 1, 7]2.
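The last two steps of this example, the map π<sub>2</sub> over the basis [1, α] followed by the auxiliary encodings, can be sketched as follows. All names are ours, and the evaluation values (1, 1, 0, 1) at the rational places and (1, α<sup>2</sup>) at the degree-2 places are assumed from the example above.

```python
# Rebuild the [10,1,7]_2 codeword of Example 8.9. F4 elements are expanded
# over the basis [1, alpha]: 1 -> (1,0), alpha -> (0,1), alpha^2 = 1+alpha -> (1,1).

PI2 = {'1': (1, 0), 'a': (0, 1), 'a2': (1, 1)}   # the map pi_2

def spc3(bits):
    """[3,2,2]_2 encoder: append an overall parity bit."""
    b0, b1 = bits
    return (b0, b1, (b0 + b1) % 2)

deg1_evals = [1, 1, 0, 1]       # f(P_i) at the four rational places
deg2_evals = ['1', 'a2']        # f(P_i) at the two places of degree 2

codeword = list(deg1_evals)     # the [1,1,1]_2 code passes bits through
for s in deg2_evals:
    codeword.extend(spc3(PI2[s]))

print(codeword, sum(codeword))  # length 10, weight 7
```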

Three polynomials and their associated curves are used to obtain codes in F<sub>16</sub> better than the best-known codes in [15]. The three polynomials are given in Table 8.5, while Table 8.6 summarises the properties of their associated curves (with *t* = 4); *w* is a primitive element of F<sub>16</sub>. The number of places of degree *j*, *A<sub>j</sub>*, is determined with the computer algebra system MAGMA [3]. The best-known linear codes from [15] over F<sub>16</sub> with *d<sub>j</sub>* = *j* for 1 ≤ *j* ≤ 4 are

$$[1, 1, 1]\_{16} \quad [3, 2, 2]\_{16} \quad [5, 3, 3]\_{16} \quad [7, 4, 4]\_{16}$$

which correspond to *C*<sub>1</sub>, *C*<sub>2</sub>, *C*<sub>3</sub> and *C*<sub>4</sub>, respectively. Since *t* = 4 for all the codes in this section and

$$[d\_1, d\_2, d\_3, d\_4] = [1, 2, 3, 4]$$

the representation *C*<sub>1</sub>(*k*; *t*; *B*<sub>1</sub>, *B*<sub>2</sub>, ..., *B<sub>t</sub>*; *d*<sub>1</sub>, *d*<sub>2</sub>, ..., *d<sub>t</sub>*) is shortened to

$$C\_1(k; t; B\_1, B\_2, \dots, B\_t; d\_1, d\_2, \dots, d\_t) \equiv C\_1(k; B\_1, B\_2, \dots, B\_t).$$

Tables 8.7 to 8.9 show improved codes from generalised AG codes with better minimum distance than the codes in [15]. It is also worth noting that codes of the form

<sup>6</sup>From Bézout's theorem, *d<sub>min</sub>* = *n* − *m* = *n* − *k* − *g* + 1.


**Table 8.5** Polynomials in F<sup>16</sup>

**Table 8.6** Properties of *X<sup>i</sup>* /F<sup>16</sup>


**Table 8.7** New codes from *X*1/F<sup>16</sup>


**Table 8.8** New codes from *X*2/F<sup>16</sup>


*C*<sub>1</sub>(*k*; *N*, 0, 0, 0) are simply Goppa codes (defined with only rational points). The symbol # in Tables 8.7 to 8.9 denotes the number of new codes from each generalised AG code *C*<sub>1</sub>(*k*; *B*<sub>1</sub>, *B*<sub>2</sub>, ..., *B<sub>t</sub>*). The tables in [7] list the curves known to have the largest number of rational points for a given genus. The curve *X*<sub>2</sub>/F<sub>16</sub> is defined by the well-known Hermitian polynomial [5].

**Table 8.9** New codes from *X*3/F<sup>16</sup>


## **8.5 Summary**

Algebraic geometry codes are codes obtained from curves. First, the motivation for studying these codes was given: from an asymptotic point of view, some families of AG codes perform better than the Gilbert–Varshamov bound, previously the best-known lower bound on the performance of linear codes. For codes of moderate length, AG codes have better minimum distances than their main competitors, non-binary BCH codes of the same rate defined over the same finite fields. Theorems and definitions needed as a precursor to AG codes were given; the key theorems are Bézout's theorem and the Riemann–Roch theorem. Examples using the well-known Hermitian curve over a finite field of cardinality 4 were then discussed. The concept of places of higher degree of a curve was presented; this notion was used in the construction of generalised AG codes.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 9 Algebraic Quasi Cyclic Codes**

#### **9.1 Introduction**

Binary self-dual codes have an interesting structure, and some are known to have the best possible minimum Hamming distance of any known codes. Closely related to the self-dual codes are the double-circulant codes; many good binary self-dual codes can be constructed in double-circulant form. Double-circulant codes as a class have been the subject of a great deal of attention, probably because they include some of the most powerful and efficient codes known to date, or codes equivalent to them. An interesting family of binary double-circulant codes, which includes self-dual and formally self-dual codes, is the family of codes based on primes. A classic paper for this family was published by Karlin [9], in which double-circulant codes based on primes congruent to ±1 and ±3 modulo 8 were considered. Self-dual codes are an important category of codes because there are bounds on their minimum distance [?]: the possibilities for their weight spectrum are constrained, and known ahead of the discovery and analysis of the codes themselves. This has created a great deal of excitement among researchers in the rush to be the first to find some of these codes. A paper summarising the state of knowledge of these codes was given by Dougherty et al. [1]. Advances in high-speed digital processors now make it feasible to implement near maximum likelihood, soft-decision decoders for these codes and thus make it possible to approach the predictions for frame error rate (FER) performance on the additive white Gaussian noise channel made by Claude Shannon back in 1959 [16].

This chapter considers the binary double-circulant codes based on primes, especially the analysis of their Hamming weight distributions. Section 9.2 introduces the notation used to describe double-circulant codes and gives a review of double-circulant codes based on primes congruent to ±1 and ±3 modulo 8. Section 9.4 describes the construction of double-circulant codes for these primes, and Sect. 9.5 presents an improved algorithm to compute the minimum Hamming distance, and also the number of codewords of a given Hamming weight, for certain double-circulant codes. The algorithm presented in that section requires the enumeration of fewer codewords than the commonly used technique [4, 18]. Section 9.6 considers the Hamming weight distribution of the double-circulant codes based on primes; a method providing an independent verification of the number of codewords of a given Hamming weight in these double-circulant codes is also discussed. In the last section of this chapter, Sect. 9.7, a probabilistic method, based on the automorphism group of these codes, to determine their minimum Hamming distance is described.

Note that, as we consider Hamming space only in this chapter, we shall omit the word "Hamming" when referring to weight and distance.

#### **9.2 Background and Notation**

A code *C* is called *self-dual* if,

$$\mathcal{C} = \mathcal{C}^\perp$$

where *C*<sup>⊥</sup> is the dual of *C*. There are two types of self-dual code: *doubly even* or Type-II, for which the weight of every codeword is divisible by 4, and *singly even* or Type-I, for which the weight of every codeword is divisible by 2. Furthermore, the code length of a Type-II code is divisible by 8. On the other hand, formally self-dual (FSD) codes are codes for which

*C* ≠ *C*<sup>⊥</sup>,

but *A<sub>C</sub>*(*z*) = *A<sub>C<sup>⊥</sup></sub>*(*z*), where *A<sub>C</sub>*(*z*) denotes the weight distribution of the code *C*. A self-dual, or FSD, code is called *extremal* if its minimum distance is the highest possible given its parameters. The bound on the minimum distance of extremal codes is [15]

$$d \le 4\left\lfloor \frac{n}{24} \right\rfloor + 4 + \varepsilon,\tag{9.1}$$

where

$$\varepsilon = \begin{cases} -2, & \text{if } \mathcal{C} \text{ is Type-I with } n = 2, 4, \text{ or } 6, \\ 2, & \text{if } \mathcal{C} \text{ is Type-I with } n \equiv 22 \pmod{24}, \text{ or } \\ 0, & \text{if } \mathcal{C} \text{ is Type-I or Type-II with } n \not\equiv 22 \pmod{24}, \end{cases} \tag{9.2}$$

for an extremal self-dual code with length *n* and minimum distance *d*. For an FSD code, the minimum distance of the extremal case is upper bounded by [4]

$$d \le 2\left\lfloor \frac{n}{8} \right\rfloor + 2.\tag{9.3}$$

As a consequence of this upper bound, extremal FSD codes are known to exist only for lengths *n* ≤ 30, except *n* = 16 and *n* = 26 [7]. Databases of best-known, not necessarily extremal, self-dual codes are given in [3, 15]. A table of binary self-dual double-circulant codes is also provided in [15].

As a class, double-circulant codes are (*n*, *k*) linear codes, where *k* = *n*/2, whose generator matrix *G* consists of two circulant matrices.

**Definition 9.1** (*Circulant Matrix*) A circulant matrix is a square matrix in which each row is a cyclic shift of the adjacent row. In addition, each column is also a cyclic shift of the adjacent column and the number of non-zeros per column is equal to those per row.

A circulant matrix is completely characterised by a polynomial formed by its first row

$$r(\mathbf{x}) = \sum\_{i=0}^{m-1} r\_i \mathbf{x}^i,$$

which is called the *defining polynomial*.

Note that the algebra of polynomials modulo *x<sup>m</sup>* − 1 is isomorphic to that of circulants [13]. Let the polynomial *r*(*x*) have degree less than *m*; then the corresponding circulant matrix **R** is an *m* × *m* square matrix of the form

$$\mathbf{R} = \begin{bmatrix} r(\mathbf{x}) \pmod{\mathbf{x}^m - 1} \\ \mathbf{x}r(\mathbf{x}) \pmod{\mathbf{x}^m - 1} \\ \vdots \\ \mathbf{x}^i r(\mathbf{x}) \pmod{\mathbf{x}^m - 1} \\ \vdots \\ \mathbf{x}^{m-1} r(\mathbf{x}) \pmod{\mathbf{x}^m - 1} \end{bmatrix} \tag{9.4}$$

where the polynomial in each row can be represented by an *m*-dimensional vector, which contains the coefficients of the corresponding polynomial.
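A circulant matrix is generated from its defining polynomial by cyclic shifts, as a short sketch shows (the function name is ours):

```python
# Build the m x m binary circulant matrix of (9.4): row i holds the
# coefficients of x^i r(x) mod (x^m - 1), i.e. the first row shifted i places.

def circulant(r_coeffs, m):
    """r_coeffs: coefficients r_0, r_1, ... of the defining polynomial."""
    row = list(r_coeffs) + [0] * (m - len(r_coeffs))
    return [row[m - i:] + row[:m - i] for i in range(m)]

# illustrative defining polynomial r(x) = 1 + x + x^3 with m = 5
for shifted_row in circulant([1, 1, 0, 1], 5):
    print(shifted_row)
```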

#### *9.2.1 Description of Double-Circulant Codes*

A double-circulant binary code is an (*n*, *n*/2) code in which the generator matrix is defined by two circulant matrices, each matrix being *n*/2 by *n*/2 bits. Each circulant consists of the cyclically shifted rows, modulo *x*<sup>n/2</sup> − 1, of a generator polynomial; these generator polynomials are defined as *r*<sub>1</sub>(*x*) and *r*<sub>2</sub>(*x*). Each codeword consists of two parts: the information data, defined as *u*(*x*), convolved with *r*<sub>1</sub>(*x*) modulo (1 + *x*<sup>n/2</sup>), adjoined with *u*(*x*) convolved with *r*<sub>2</sub>(*x*) modulo (1 + *x*<sup>n/2</sup>). The code is the same as a non-systematic, tail-biting convolutional code of rate one-half. Each codeword is [*u*(*x*)*r*<sub>1</sub>(*x*), *u*(*x*)*r*<sub>2</sub>(*x*)]. If *r*<sub>1</sub>(*x*) [or *r*<sub>2</sub>(*x*)] has no common factor with (1 + *x*<sup>n/2</sup>), then the respective circulant matrix is non-singular and may be inverted. The inverted circulant matrix becomes an identity matrix, and each codeword is defined by [*u*(*x*), *u*(*x*)*r*(*x*)], where *r*(*x*) = *r*<sub>2</sub>(*x*)/*r*<sub>1</sub>(*x*) modulo (1 + *x*<sup>n/2</sup>) [or *r*(*x*) = *r*<sub>1</sub>(*x*)/*r*<sub>2</sub>(*x*) modulo (1 + *x*<sup>n/2</sup>), respectively]. The code is now the same as a systematic, tail-biting convolutional code of rate one-half.
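The codeword construction [*u*(*x*)*r*<sub>1</sub>(*x*), *u*(*x*)*r*<sub>2</sub>(*x*)] amounts to two cyclic convolutions over GF(2). A minimal sketch, with illustrative polynomials that are not taken from the text:

```python
# Form a double-circulant codeword by convolving u(x) with each defining
# polynomial modulo x^m - 1 over GF(2). Polynomials are bit lists of length m.

def mul_mod(u, r, m):
    """u(x) r(x) mod (x^m - 1) over GF(2)."""
    out = [0] * m
    for i, ui in enumerate(u):
        if ui:
            for j, rj in enumerate(r):
                if rj:
                    out[(i + j) % m] ^= 1
    return out

def dc_codeword(u, r1, r2, m):
    return mul_mod(u, r1, m) + mul_mod(u, r2, m)

m = 7
u  = [1, 1, 0, 0, 0, 0, 0]   # u(x) = 1 + x
r1 = [1, 0, 0, 0, 0, 0, 0]   # r1(x) = 1: identity circulant (systematic form)
r2 = [1, 0, 1, 0, 0, 0, 0]   # r2(x) = 1 + x^2
print(dc_codeword(u, r1, r2, m))   # [u, u*r2] with u*r2 = 1 + x + x^2 + x^3
```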

For double-circulant codes where one circulant matrix is non-singular and may be inverted, the codes can be put into two classes, namely *pure* and *bordered* double-circulant codes, whose generator matrices *G<sub>p</sub>* and *G<sub>b</sub>* are shown in (9.5a)

$$\mathbf{G}\_p = \begin{bmatrix}\; \mathbf{I}\_k \;\big|\; \mathbf{R} \;\end{bmatrix} \tag{9.5a}$$

and (9.5b),

$$\mathbf{G}\_b = \left[\; \mathbf{I}\_k \;\middle|\; \begin{matrix} \alpha & 1 & \cdots & 1 \\ 1 & & & \\ \vdots & & \mathbf{R} & \\ 1 & & & \end{matrix} \;\right] \tag{9.5b}$$

respectively. Here, **I**<sub>k</sub> is the *k* × *k* identity matrix, and α ∈ {0, 1}.

**Definition 9.2** (*Quadratic Residues*) Let α be a generator of the finite field F<sub>p</sub>, where *p* is an odd prime. An element *r* ≡ α<sup>2</sup> (mod *p*) is called a quadratic residue modulo *p*, and so is *r<sup>i</sup>* ∈ F<sub>p</sub> for any integer *i*. Because the element α has (multiplicative) order *p* − 1 over F<sub>p</sub>, *r* = α<sup>2</sup> has order ½(*p* − 1). The set of quadratic residues modulo *p*, *Q*, and the set of non-quadratic residues modulo *p*, *N*, are defined as

$$\begin{split} \mathcal{Q} &= \{r, r^2, \dots, r^i, \dots, r^{\frac{p-3}{2}}, r^{\frac{p-1}{2}} = 1\} \\ &= \{\alpha^2, \alpha^4, \dots, \alpha^{2i}, \dots, \alpha^{p-3}, \alpha^{p-1} = 1\} \end{split} \tag{9.6a}$$

and

$$\begin{aligned} N &= \{ n : n \in \mathbb{F}\_p, \ n \notin \mathcal{Q} \text{ and } n \neq 0 \} \\ &= \{ nr, nr^2, \dots, nr^i, \dots, nr^{\frac{p-3}{2}}, n \} \\ &= \{ \alpha^{2i+1} \; : \; 0 \le i \le \frac{p-3}{2} \} \end{aligned} \tag{9.6b}$$

respectively.

As such, *N* ∪ *Q* ∪ {0} = F<sub>p</sub>. It can be seen from the definitions of *Q* and *N* that, if *r* ∈ *Q*, then *r* = α<sup>e</sup> for some even *e*; and if *n* ∈ *N*, then *n* = α<sup>e</sup> for some odd *e*. Hence, if *n* ∈ *N* and *r* ∈ *Q*, then *rn* = α<sup>2i</sup>α<sup>2j+1</sup> = α<sup>2(i+j)+1</sup> ∈ *N*. Similarly, *rr*′ = α<sup>2i</sup>α<sup>2j</sup> = α<sup>2(i+j)</sup> ∈ *Q* and *nn*′ = α<sup>2i+1</sup>α<sup>2j+1</sup> = α<sup>2(i+j+1)</sup> ∈ *Q*.
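The sets *Q* and *N* and the closure properties just derived are easily checked numerically; a short sketch for *p* = 11 (the function name is ours):

```python
# Compute the quadratic residues Q and non-residues N of Definition 9.2
# and verify the closure properties QQ, NN in Q and QN in N, for p = 11.

def residue_sets(p):
    Q = sorted({(x * x) % p for x in range(1, p)})
    N = sorted(set(range(1, p)) - set(Q))
    return Q, N

Q, N = residue_sets(11)
print(Q, N)   # [1, 3, 4, 5, 9] [2, 6, 7, 8, 10]

assert all((q1 * q2) % 11 in Q for q1 in Q for q2 in Q)   # QQ in Q
assert all((n1 * n2) % 11 in Q for n1 in N for n2 in N)   # NN in Q
assert all((q * n) % 11 in N for q in Q for n in N)       # QN in N
```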


#### **9.3 Good Double-Circulant Codes**

# *9.3.1 Circulants Based Upon Prime Numbers Congruent to* **±***3 Modulo 8*

An important category is circulants whose length is equal to a prime number, *p*, which is congruent to ±3 modulo 8. For many of these prime numbers, there is only a single cyclotomic coset, apart from zero. In these cases, 1 + *x<sup>p</sup>* factorises into the product of two irreducible polynomials, (1 + *x*)(1 + *x* + *x*<sup>2</sup> + *x*<sup>3</sup> + ··· + *x*<sup>p−1</sup>). Apart from the polynomial (1 + *x* + *x*<sup>2</sup> + *x*<sup>3</sup> + ··· + *x*<sup>p−1</sup>), all of the other 2<sup>p</sup> − 2 non-zero polynomials of degree less than *p* are in one of two sets: the set of 2<sup>p−1</sup> even-weight polynomials with 1 + *x* as a factor, denoted as **Sf**, and the set of 2<sup>p−1</sup> odd-weight polynomials which are relatively prime to 1 + *x<sup>p</sup>*, denoted as **Sr**. The multiplicative order of each set is 2<sup>p−1</sup> − 1, and each forms a ring of polynomials modulo 1 + *x<sup>p</sup>*. Any non-zero polynomial apart from (1 + *x* + *x*<sup>2</sup> + *x*<sup>3</sup> + ··· + *x*<sup>p−1</sup>) is equal to α(*x*)<sup>i</sup> for some integer *i* if the polynomial is in **Sf**, or is equal to *a*(*x*)<sup>i</sup> for some integer *i* if it is in **Sr**. An example for *p* = 11 is given in Appendix "Circulant Analysis *p* = 11"; in that table, α(*x*) = 1 + *x* + *x*<sup>2</sup> + *x*<sup>4</sup> and *a*(*x*) = 1 + *x* + *x*<sup>3</sup>. For these primes, as the circulant length is equal to *p*, the generator polynomial *r*(*x*) can either contain 1 + *x* as a factor, not contain 1 + *x* as a factor, or be equal to (1 + *x* + *x*<sup>2</sup> + *x*<sup>3</sup> + ··· + *x*<sup>p−1</sup>). The last case is not a good choice for *r*(*x*), as the minimum codeword weight is then 2, which occurs when *u*(*x*) = 1 + *x*: in this case, *r*(*x*)*u*(*x*) = 1 + *x<sup>p</sup>* = 0 modulo 1 + *x<sup>p</sup>* and the codeword is [1 + *x*, 0], of weight 2.

When *r*(*x*) is in the ring **Sf**, *u*(*x*)*r*(*x*) must also be in **Sf** and therefore be of even weight, except when *u*(*x*) = (1 + *x* + *x*<sup>2</sup> + *x*<sup>3</sup> + ··· + *x*<sup>p−1</sup>).

In this case *u*(*x*)*r*(*x*) = 0 modulo 1 + *x<sup>p</sup>* and the codeword is [1 + *x* + *x*<sup>2</sup> + *x*<sup>3</sup> + ··· + *x*<sup>p−1</sup>, 0], of weight *p*. When *u*(*x*) has even weight, the resulting codewords are doubly even. When *u*(*x*) has odd weight, the resulting codewords consist of two parts, one with odd weight and the other with even weight; the net result is codewords that always have odd weight. Thus, there are both even- and odd-weight codewords when *r*(*x*) is from **Sf**.

When *r*(*x*) is in the ring **Sr**, *u*(*x*)*r*(*x*) is always non-zero and is in **Sf** (even weight) only when *u*(*x*) has even weight, and the resulting codewords are doubly even. When *u*(*x*) has odd weight, *u*(*x*) = *a*(*x*)<sup>j</sup> and *u*(*x*)*r*(*x*) = *a*(*x*)<sup>j</sup>*a*(*x*)<sup>i</sup> = *a*(*x*)<sup>i+j</sup>, which is in the ring **Sr** and has odd weight. The resulting codewords have even weight, since they consist of two parts, each with odd weight. Thus, when *r*(*x*) is from **Sr**, all of the codewords have even weight. Furthermore, since *r*(*x*) = *a*(*x*)<sup>i</sup>, *r*(*x*)*a*(*x*)<sup>2<sup>(p−1)</sup>−1−i</sup> = *a*(*x*)<sup>2<sup>(p−1)</sup>−1</sup> = 1 and hence the inverse of *r*(*x*) is 1/*r*(*x*) = *a*(*x*)<sup>2<sup>(p−1)</sup>−1−i</sup>.
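The exponent arithmetic above can be checked directly for *p* = 11 with *a*(*x*) = 1 + *x* + *x*<sup>3</sup>, the generator used in Appendix "Circulant Analysis *p* = 11". The sketch below (helper names are ours) verifies that the multiplicative order of *a*(*x*) modulo *x*<sup>11</sup> − 1 is 2<sup>p−1</sup> − 1 = 1023, so that 1/*r*(*x*) = *a*(*x*)<sup>1023−i</sup> when *r*(*x*) = *a*(*x*)<sup>i</sup>.

```python
# Verify over GF(2) that a(x) = 1 + x + x^3 has multiplicative order
# 2^(p-1) - 1 = 1023 modulo x^11 - 1 (note 1023 = 3 * 11 * 31).

P = 11

def mul_mod(u, r, m=P):
    """u(x) r(x) mod (x^m - 1) over GF(2); polynomials as bit lists."""
    out = [0] * m
    for i, ui in enumerate(u):
        if ui:
            for j, rj in enumerate(r):
                if rj:
                    out[(i + j) % m] ^= 1
    return out

def poly_pow(a, e, m=P):
    """a(x)^e mod (x^m - 1) by square-and-multiply."""
    result = [1] + [0] * (m - 1)
    base = list(a)
    while e:
        if e & 1:
            result = mul_mod(result, base, m)
        base = mul_mod(base, base, m)
        e >>= 1
    return result

one = [1] + [0] * (P - 1)
a = [1, 1, 0, 1] + [0] * (P - 4)          # a(x) = 1 + x + x^3

assert poly_pow(a, 1023) == one
# the order is exactly 1023: no maximal proper divisor of 1023 gives 1
assert all(poly_pow(a, 1023 // f) != one for f in (3, 11, 31))
print("a(x) has multiplicative order 1023")
```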

By constructing a table (or sampled table) of **Sr**, it is very straightforward to design non-singular double-circulant codes. The minimum codeword weight of the code, *d<sub>min</sub>*, cannot exceed the weight of *r*(*x*) plus 1. Hence, the weight of *a*(*x*)<sup>i</sup> needs to be at least *d<sub>min</sub>* − 1 for it to be considered as a candidate for *r*(*x*). The weight of the inverse of *r*(*x*), *a*(*x*)<sup>2<sup>(p−1)</sup>−1−i</sup>, also needs to be at least *d<sub>min</sub>* − 1. For odd-weight *u*(*x*) = *a*(*x*)<sup>j</sup>, *u*(*x*)*r*(*x*) = *a*(*x*)<sup>j</sup>*a*(*x*)<sup>i</sup> = *a*(*x*)<sup>i+j</sup>; hence, the weight of *u*(*x*)*r*(*x*) can be found simply by looking up the weight of *a*(*x*)<sup>i+j</sup> in the table. Self-dual codes are those with 1/*r*(*x*) = *r*(*x*<sup>−1</sup>). With a single cyclotomic coset, 2<sup>(p−1)/2</sup> ≡ −1, and it follows that *a*(*x*)<sup>2<sup>(p−1)/2</sup></sup> = *a*(*x*<sup>−1</sup>). With *r*(*x*) = *a*(*x*)<sup>i</sup>, *r*(*x*<sup>−1</sup>) = *a*(*x*)<sup>2<sup>(p−1)/2</sup>i</sup>. In order that 1/*r*(*x*) = *r*(*x*<sup>−1</sup>), it is necessary that

$$a(\mathbf{x})^{2^{(p-1)}-1-i} = a(\mathbf{x})^{2^{\frac{(p-1)}{2}}i}.\tag{9.7}$$

Equating the exponents, modulo 2(*p*−1) − 1, gives

$$2^{\frac{(p-1)}{2}}i = m(2^{(p-1)} - 1) - i,\tag{9.8}$$

where m is an integer. Solving for i:

$$i = \frac{m(2^{(p-1)} - 1)}{(2^{\frac{(p-1)}{2}} + 1)}.\tag{9.9}$$

Hence, the number of distinct self-dual codes is equal to 2<sup>(p−1)/2</sup> + 1.

For the example, *p* = 13 as above,

$$i = m \frac{2^{(p-1)} - 1}{2^{\frac{(p-1)}{2}} + 1} = m \frac{4095}{65} = 63 m$$

and there are 2<sup>(p−1)/2</sup> + 1 = 65 self-dual codes, given by *a*(*x*)<sup>63m</sup> for 1 ≤ *m* ≤ 65: *a*(*x*)<sup>63</sup>, *a*(*x*)<sup>126</sup>, *a*(*x*)<sup>189</sup>, ..., *a*(*x*)<sup>4095</sup>.
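The arithmetic of this example is easily confirmed:

```python
# p = 13: the self-dual circulants are a(x)^(63 m), and there are
# 2^((p-1)/2) + 1 = 65 of them.

p = 13
count = 2 ** ((p - 1) // 2) + 1           # 65 self-dual codes
step = (2 ** (p - 1) - 1) // count        # exponent step i = 63 m
print(step, count)                        # 63 65
assert step * count == 2 ** (p - 1) - 1   # 63 * 65 = 4095
```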

As *p* is congruent to ±3, the set (*u*(*x*)*r*(*x*))<sup>2<sup>t</sup></sup> maps to (*u*(*x*)*r*(*x*)) for *t* = 1, ..., *r*, where *r* is the size of the cyclotomic coset of 2<sup>(p−1)/2</sup> + 1. In the case of *p* = 13 above, there are 4 cyclotomic cosets of 65, three of length 10 and one of length 2. This implies that there are only 4 non-equivalent self-dual codes.

For *p* congruent to −3 modulo 8, 2<sup>(p−1)/2</sup> + 1 is not divisible by 3. This means that the pure double-circulant quadratic residue code is not self-dual. Since the quadratic residue code has multiplicative order 3, it follows that for *p* congruent to −3 modulo 8, the quadratic residue, pure double-circulant code is self-orthogonal, and *r*(*x*) = *r*(*x*<sup>−1</sup>).

For *p* congruent to 3, 2<sup>(p−1)/2</sup> + 1 is divisible by 3 and the pure double-circulant quadratic residue code is self-dual. In this case, *a*(*x*) has multiplicative order 2<sup>p−1</sup> − 1, and *a*(*x*)<sup>(2<sup>p−1</sup>−1)/3</sup> must have exponents equal to the quadratic residues (or non-residues). The inverse polynomial is *a*(*x*)<sup>2(2<sup>p−1</sup>−1)/3</sup>, with exponents equal to the non-residues (or residues, respectively), and defines a self-dual circulant code. As an example, for *p* = 11 as listed in Appendix "Circulant Analysis *p* = 11", 2<sup>p−1</sup> − 1 = 1023 and *a*(*x*)<sup>341</sup> = *x* + *x*<sup>3</sup> + *x*<sup>4</sup> + *x*<sup>5</sup> + *x*<sup>9</sup>, whose exponents 1, 3, 4, 5 and 9 are the quadratic residues of 11; *a*(*x*)<sup>682</sup> = *x*<sup>2</sup> + *x*<sup>6</sup> + *x*<sup>7</sup> + *x*<sup>8</sup> + *x*<sup>10</sup> corresponds to the quadratic non-residues 2, 6, 7, 8 and 10, as can be seen from Appendix "Circulant Analysis *p* = 11". Section 9.4.3 discusses pure double-circulant codes for these primes in more detail.

# *9.3.2 Circulants Based Upon Prime Numbers Congruent to* **±***1 Modulo 8: Cyclic Codes*

MacWilliams and Sloane [13] discuss the automorphism group of the extended cyclic quadratic residue (eQR) codes and show that it includes the projective special linear group *PSL*<sub>2</sub>(*p*). They describe a procedure by which a double-circulant code may be constructed from a codeword of the eQR code; it is fairly straightforward. The projective special linear group *PSL*<sub>2</sub>(*p*) for a prime *p* is defined by the permutations *y* → (*ay* + *b*)/(*cy* + *d*) mod *p*, where the integers *a*, *b*, *c*, *d* are chosen such that two cyclic groups of order (*p* + 1)/2 are obtained. A codeword of the (*p* + 1, (*p* + 1)/2) eQR code is obtained, and the non-zero coordinates of the codeword are placed in each cyclic group. This splits the codeword into two cyclic parts, each of which defines a circulant polynomial.

The procedure is best illustrated with an example. Let α ∈ F<sub>*p*<sup>2</sup></sub> be a primitive (*p*<sup>2</sup> − 1)th root of unity; then β = α<sup>2*p*−2</sup> is a primitive ½(*p* + 1)th root of unity, since *p*<sup>2</sup> − 1 = ½(2*p* − 2)(*p* + 1). Let λ = 1/(1 + β) and *a* = λ<sup>2</sup> − λ; then the permutation π<sub>1</sub> on a coordinate *y* is defined as

$$
\pi\_1: y \mapsto \frac{y+1}{ay} \mod p
$$

where π<sub>1</sub> ∈ PSL2(*p*) (see Sect. 9.4.3 for the definition of PSL2(*p*)). As an example, consider the prime *p* = 23. The permutation π<sub>1</sub> : *y* → (*y* + 1)/(5*y*) mod *p* produces the two cyclic groups

$$(1, 5, 3, 11, 9, 13, 8, 10, 20, 17, 4, 6)$$

and

$$(2, 21, 7, 16, 12, 19, 22, 0, 23, 14, 15, 18).$$
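The two cyclic groups can be reproduced directly from the permutation. In the sketch below, 23 is used as the label for the coordinate ∞ (a labelling choice of this sketch, matching the coordinate list above), with 0 mapping to ∞ and ∞ mapping to 1/(5) mod 23:

```python
def pi1(y, p=23, a=5, INF=23):
    # pi_1 : y -> (y + 1)/(a*y) mod p, with INF standing for the coordinate infinity
    if y == INF:
        return pow(a, -1, p)      # limit as y -> infinity is 1/a
    if y == 0:
        return INF                # division by zero: 0 maps to infinity
    return (y + 1) * pow(a * y % p, -1, p) % p

def cycle(start):
    c, y = [start], pi1(start)
    while y != start:
        c.append(y)
        y = pi1(y)
    return c

print(cycle(1))  # [1, 5, 3, 11, 9, 13, 8, 10, 20, 17, 4, 6]
print(cycle(2))  # [2, 21, 7, 16, 12, 19, 22, 0, 23, 14, 15, 18]
```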

There are 3 cyclotomic cosets for *p* = 23 as follows:

$$\begin{aligned} C\_0 &= \{0\} \\ C\_1 &= \{1, 2, 4, 8, 16, 9, 18, 13, 3, 6, 12\} \\ C\_5 &= \{5, 10, 20, 17, 11, 22, 21, 19, 15, 7, 14\} .\end{aligned}$$

The idempotent given by *C*<sub>1</sub> may be used to define a generator polynomial, *r*(*x*), which defines the (23, 12, 7) cyclic quadratic residue code:

$$r(\mathbf{x}) = \mathbf{x} + \mathbf{x}^2 + \mathbf{x}^3 + \mathbf{x}^4 + \mathbf{x}^6 + \mathbf{x}^8 + \mathbf{x}^9 + \mathbf{x}^{12} + \mathbf{x}^{13} + \mathbf{x}^{16} + \mathbf{x}^{18}.\tag{9.10}$$

Codewords of the (23, 12, 7) cyclic code are given by *u*(*x*)*r*(*x*) modulo 1 + *x*<sup>23</sup>, and with *u*(*x*) = 1 the non-zero coordinates of the codeword obtained are

(1, 2, 4, 8, 16, 9, 18, 13, 3, 6, 12)

the cyclotomic coset *C*1.

The extended code has an additional parity check using coordinate 23 to produce the corresponding codeword of the extended (24, 12, 8) code with the non-zero coordinates:

(1, 2, 4, 8, 16, 9, 18, 13, 3, 6, 12, 23).

Mapping these coordinates onto the two cyclic groups, with a 1 in each position whose coordinate appears in the respective cyclic group and a 0 otherwise, produces

$$(1,0,1,0,1,1,1,0,0,0,1,1)$$

and

$$(1,0,0,1,1,0,0,0,1,0,0,1)$$

which define the two circulant polynomials, *r*1(*x*) and *r*2(*x*), for the (24, 12, 8) pure double-circulant code

$$r\_1(\mathbf{x}) = 1 + \mathbf{x}^2 + \mathbf{x}^4 + \mathbf{x}^5 + \mathbf{x}^6 + \mathbf{x}^{10} + \mathbf{x}^{11}$$

$$r\_2(\mathbf{x}) = 1 + \mathbf{x}^3 + \mathbf{x}^4 + \mathbf{x}^8 + \mathbf{x}^{11}.\tag{9.11}$$
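The mapping from the extended codeword onto the two circulant polynomials can be reproduced as follows (the cycle orderings are those listed above, with 23 standing for the extended parity coordinate):

```python
cyc1 = [1, 5, 3, 11, 9, 13, 8, 10, 20, 17, 4, 6]
cyc2 = [2, 21, 7, 16, 12, 19, 22, 0, 23, 14, 15, 18]
# non-zero coordinates of the (24, 12, 8) extended codeword
support = {1, 2, 4, 8, 16, 9, 18, 13, 3, 6, 12, 23}

r1 = [1 if c in support else 0 for c in cyc1]
r2 = [1 if c in support else 0 for c in cyc2]
print(r1)  # [1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1]
print(r2)  # [1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1]
```

The positions of the 1s in each 12-tuple give the exponents of *r*<sub>1</sub>(*x*) and *r*<sub>2</sub>(*x*), respectively.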

The inverse of *r*1(*x*) modulo (1 + *x*<sup>12</sup>) is ψ(*x*), where

$$
\psi(\mathbf{x}) = 1 + \mathbf{x} + \mathbf{x}^2 + \mathbf{x}^6 + \mathbf{x}^7 + \mathbf{x}^8 + \mathbf{x}^{10},
$$

and this may be used to produce an equivalent (24, 12, 8) pure double-circulant code which has the identity matrix as the first circulant


**Table 9.1** Double-circulant codes mostly based upon quadratic residues of prime numbers

<sup>a</sup>Codes with outstanding *dmin*. <sup>b</sup>Codes not based on quadratic residues

The best (2*p*, *p*) circulant polynomial either contains the factor 1 + *x* (denoted β(*x*)) or is relatively prime to 1 + *x*<sup>*n*</sup> (denoted *b*(*x*))

β(*x*) circulants can be bordered to produce (2*p* + 2, *p* + 1) circulants

$$\begin{aligned} \hat{r}\_1(\mathbf{x}) &= (1 + \mathbf{x}^2 + \mathbf{x}^4 + \mathbf{x}^5 + \mathbf{x}^6 + \mathbf{x}^{10} + \mathbf{x}^{11})\, \psi(\mathbf{x}) \mod (1 + \mathbf{x}^{12})\\ \hat{r}\_2(\mathbf{x}) &= (1 + \mathbf{x}^3 + \mathbf{x}^4 + \mathbf{x}^8 + \mathbf{x}^{11})\, \psi(\mathbf{x}) \mod (1 + \mathbf{x}^{12}). \end{aligned}$$

After evaluating terms, the two circulant polynomials are found to be

$$\begin{aligned} \hat{r}\_1(\mathbf{x}) &= 1 \\ \hat{r}\_2(\mathbf{x}) &= 1 + \mathbf{x} + \mathbf{x}^2 + \mathbf{x}^4 + \mathbf{x}^5 + \mathbf{x}^9 + \mathbf{x}^{11}, \end{aligned} \tag{9.12}$$

and it can be seen that the first circulant will produce the identity matrix of dimension 12. Jenson [8] lists the circulant codes for primes *p* < 200 that can be constructed in this way. A non-systematic double-circulant code is possible in every case, but the existence of a systematic code depends upon one of the circulant matrices being non-singular. For *p* < 200 there are just two cases, *p* = 89 and *p* = 167, where a systematic double-circulant construction is not possible.
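The circulant algebra above is easy to double-check by direct GF(2) polynomial multiplication modulo 1 + *x*<sup>12</sup> (a sketch; polynomials are represented as sets of exponents, a choice made here for brevity):

```python
from collections import Counter

def polymul(a, b, m):
    # multiply two GF(2) polynomials, given as sets of exponents, modulo x^m - 1
    c = Counter((i + j) % m for i in a for j in b)
    return {e for e, k in c.items() if k % 2}

r1  = {0, 2, 4, 5, 6, 10, 11}   # 1 + x^2 + x^4 + x^5 + x^6 + x^10 + x^11
r2  = {0, 3, 4, 8, 11}          # 1 + x^3 + x^4 + x^8 + x^11
psi = {0, 1, 2, 6, 7, 8, 10}    # psi(x)

print(polymul(r1, psi, 12))          # {0}: confirms r1(x) psi(x) = 1
print(sorted(polymul(r2, psi, 12)))  # [0, 1, 2, 4, 5, 9, 11]: recomputes r2(x) psi(x)
```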

Table 9.1 lists the best circulant codes as a function of length. Most of these codes are well known and have been previously published, but not necessarily as circulant codes. Moreover, the *dmin* of some of the longer codes has only been bounded and has not been explicitly stated in the literature. Nearly all of the best codes are codes


**Table 9.2** Generator polynomials for pure double-circulant codes

based upon the two types of quadratic residue circulant codes. For codes based upon *p* = ±3 *mod* 8, it is an open question whether a better circulant code exists than that given by the quadratic residues. For *p* = ±1 *mod* 8, there are counterexamples. For example, the (72, 36, 14) code in Table 9.1 is better than the (72, 36, 12) circulant code which is based upon the extended cyclic quadratic residue code of length 71. The circulant generator polynomial *g*(*x*) for each of the codes of Table 9.1 is given in Table 9.2.

In Table 9.1, where the best (2*p*, *p*) code is given as *b*(*x*), the (2*p* + 2, *p* + 1) circulant code can still be constructed from β(*x*), but this code has the same *dmin* as the shorter pure double-circulant code. For example, for the prime 109, *b*(*x*) produces a double-circulant (218, 109, 30) code. The polynomial β(*x*) produces a double-circulant (218, 109, 29) code, which when bordered becomes a (220, 110, 30) code. It should be noted that β(*x*) need not have the overall parity bit border added. In this case, a (2*p* + 1, *p* + 1) code is produced, but with the same *dmin* as the β(*x*) code. In the latter example, a (219, 110, 29) code is produced.

#### **9.4 Code Construction**

Two binary linear codes, *A* and *B*, are *equivalent* if there exists a permutation π on the coordinates of the codewords which maps the codewords of *A* onto codewords of *B*. We shall write this as *B* = π(*A*). If π transforms a code *C* into itself, then we say that π fixes the code, and the set of all permutations of this kind forms the automorphism group of *C*, denoted Aut(*C*). MacWilliams and Sloane [13] give some necessary but not sufficient conditions on the equivalence of double-circulant codes, which are restated for convenience in the lemma below.

**Lemma 9.1** (cf. [13, Problem 7, Chap. 16]) *Let A and B be double-circulant codes with generator matrices* [*I<sub>k</sub>* | *A*] *and* [*I<sub>k</sub>* | *B*]*, respectively. Let the polynomials a*(*x*) *and b*(*x*) *be the defining polynomials of A and B. The codes A and B are equivalent if any of the following conditions holds:*


#### *Proof*


Consider an (*n*, *k*, *d*) pure double-circulant code. For a given user message, represented by a polynomial *u*(*x*) of degree at most *k* − 1, a codeword of the double-circulant code has the form (*u*(*x*) | *u*(*x*)*r*(*x*) (mod *x<sup>m</sup>* − 1)). The defining polynomial *r*(*x*) characterises the resulting double-circulant code. Before the choice of *r*(*x*) is discussed, consider the following lemmas and corollary.

**Lemma 9.2** *Let a*(*x*) *be a polynomial over* F<sub>2</sub> *of degree at most m* − 1*, i.e. a*(*x*) = Σ<sub>*i*=0</sub><sup>*m*−1</sup> *a<sub>i</sub>x<sup>i</sup> where a<sub>i</sub>* ∈ {0, 1}*. The weight of the polynomial* (1 + *x*)*a*(*x*) (mod *x<sup>m</sup>* − 1)*, denoted by* wt*<sub>H</sub>*((1 + *x*)*a*(*x*))*, is even.*

*Proof* Let *w* = wt*<sub>H</sub>*(*a*(*x*)) = wt*<sub>H</sub>*(*xa*(*x*)) and let *S* = {*i* : *a*<sub>(*i*+1) mod *m*</sub> = *a<sub>i</sub>* = 1, 0 ≤ *i* ≤ *m* − 1}, the set of positions where *a*(*x*) and *xa*(*x*) both have a non-zero coefficient:

$$\begin{aligned} \operatorname{wt}\_H((1+\mathfrak{x})a(\mathfrak{x})) &= \operatorname{wt}\_H(a(\mathfrak{x})) + \operatorname{wt}\_H(\mathfrak{x}a(\mathfrak{x})) - 2|S| \\ &= 2(\mathfrak{w} - |S|), \end{aligned}$$

which is even.
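Lemma 9.2 is easy to spot-check numerically. A small sketch (random polynomials over GF(2), represented as sets of exponents, a representation chosen here for brevity):

```python
import random
from collections import Counter

def polymul(a, b, m):
    # multiply two GF(2) polynomials, given as sets of exponents, modulo x^m - 1
    c = Counter((i + j) % m for i in a for j in b)
    return {e for e, k in c.items() if k % 2}

random.seed(1)
m = 12
for _ in range(200):
    a = {i for i in range(m) if random.getrandbits(1)}
    prod = polymul({0, 1}, a, m)   # (1 + x) a(x) mod x^m - 1
    assert len(prod) % 2 == 0      # the weight is always even
print("ok")
```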

**Lemma 9.3** *An m* × *m circulant matrix R with defining polynomial r*(*x*) *is nonsingular if and only if r*(*x*) *is relatively prime to x<sup>m</sup>* − 1*.*

*Proof* If *r*(*x*) is not relatively prime to *x<sup>m</sup>* − 1, i.e. GCD(*r*(*x*), *x<sup>m</sup>* − 1) = *t*(*x*) for some polynomial *t*(*x*) ≠ 1, then choosing the non-zero polynomial *a*(*x*) = (*x<sup>m</sup>* − 1)/*t*(*x*) gives *r*(*x*)*a*(*x*) = 0 (mod *x<sup>m</sup>* − 1), and therefore *R* is singular.

If *r*(*x*) is relatively prime to *x<sup>m</sup>* − 1, i.e. GCD (*r*(*x*), *x<sup>m</sup>* − 1) = 1, then from the extended Euclidean algorithm, it follows that, for some unique polynomials *a*(*x*) and *b*(*x*), *r*(*x*)*a*(*x*) + (*x<sup>m</sup>* − 1)*b*(*x*) = 1, which is equivalent to *r*(*x*)*a*(*x*) = 1 (mod *x<sup>m</sup>* − 1). Hence *R* is non-singular, being invertible with a matrix inverse whose defining polynomial is *a*(*x*).
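Lemma 9.3 suggests a direct invertibility test for a circulant: compute GCD(*r*(*x*), *x<sup>m</sup>* − 1) over GF(2). A minimal sketch, with polynomials represented as integer bitmasks (bit *i* = coefficient of *x<sup>i</sup>*, a representation chosen here for brevity):

```python
def gf2_mod(a, b):
    # remainder of a(x) divided by b(x) over GF(2); bit i = coefficient of x^i
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)
    return a

def gf2_gcd(a, b):
    # Euclidean algorithm in GF(2)[x]
    while b:
        a, b = b, gf2_mod(a, b)
    return a

m = 12
xm1 = (1 << m) | 1               # x^12 + 1 (equals x^12 - 1 over GF(2))
r1 = 0b110001110101              # 1 + x^2 + x^4 + x^5 + x^6 + x^10 + x^11
print(gf2_gcd(r1, xm1))          # 1: relatively prime, so the circulant is non-singular
print(gf2_gcd(0b11, xm1))        # 3 (= 1 + x): even-weight polynomial, circulant singular
```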

**Corollary 9.1** *From Lemma 9.3,*


*Proof* The proof for the first case is obvious from the proof of Lemma 9.3. For the second case, if the weight of *r*(*x*) is even then *r*(*x*) is divisible by 1 + *x*. Since 1 + *x* is a factor of *x<sup>m</sup>* − 1, such an *r*(*x*) is not relatively prime to *x<sup>m</sup>* − 1, and so the weight of an invertible *r*(*x*) is necessarily odd. The inverse of *r*(*x*)<sup>−1</sup> is *r*(*x*), and for this inverse to exist *r*(*x*)<sup>−1</sup> must be relatively prime to *x<sup>m</sup>* − 1; hence the weight of *r*(*x*)<sup>−1</sup> is also necessarily odd.

#### **Lemma 9.4** *Let p be an odd prime, and then*

*(i) p* | 2<sup>*p*−1</sup> − 1*, and (ii) the integer q for which pq* = 2<sup>*p*−1</sup> − 1 *is odd.*

*Proof* From Fermat's little theorem, we know that for any integer *a* not divisible by a prime *p*, *a*<sup>*p*−1</sup> ≡ 1 (mod *p*). This is equivalent to *a*<sup>*p*−1</sup> − 1 = *pq* for some integer *q*. Letting *a* = 2, we have

$$q = \frac{2^{p-1} - 1}{p}$$

which is clearly odd since neither denominator nor numerator contains 2 as a factor.
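Both parts of Lemma 9.4 are easy to spot-check for small primes:

```python
# verify Lemma 9.4: p divides 2^(p-1) - 1 and the quotient q is odd
for p in [3, 5, 7, 11, 13, 23, 31]:
    q, r = divmod(2**(p - 1) - 1, p)
    assert r == 0        # (i) p | 2^(p-1) - 1, by Fermat's little theorem
    assert q % 2 == 1    # (ii) the quotient q is odd
print("ok")
```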

**Lemma 9.5** *Let p be a prime and j*(*x*) = Σ<sub>*i*=0</sub><sup>*p*−1</sup> *x<sup>i</sup>; then*

$$(1+x)^{2^{p-1}-1} = 1+j(x) \bmod (x^p-1).$$

*Proof* We can write (1 + *x*)<sup>2<sup>*p*−1</sup>−1</sup> as

$$\begin{aligned} \left(1+x\right)^{2^{p-1}-1} &= \frac{\left(1+x\right)^{2^{p-1}}}{1+x} = \frac{1+x^{2^{p-1}}}{1+x} \\ &= \sum\_{i=0}^{2^{p-1}-1} x^i. \end{aligned}$$

From Lemma 9.4, we know that the integer *q* = (2<sup>*p*−1</sup> − 1)/*p* is odd. We can then write Σ<sub>*i*=0</sub><sup>2<sup>*p*−1</sup>−1</sup> *x<sup>i</sup>* in terms of *j*(*x*) as follows:

$$\begin{aligned} \sum\_{i=0}^{2^{p-1}-1} x^i = 1 &+ x\underbrace{\left(1 + x + \dots + x^{p-1}\right)}\_{j(x)} + x^{p+1}\underbrace{\left(1 + x + \dots + x^{p-1}\right)}\_{j(x)} + \dots + \\ &x^{(q-3)p+1}\underbrace{\left(1 + x + \dots + x^{p-1}\right)}\_{j(x)} + x^{(q-2)p+1}\underbrace{\left(1 + x + \dots + x^{p-1}\right)}\_{j(x)} + \\ &x^{(q-1)p+1}\underbrace{\left(1 + x + \dots + x^{p-1}\right)}\_{j(x)} \\ = 1 &+ \underbrace{xj(x)(1 + x^p) + x^{2p+1}j(x)(1 + x^p) + \dots + x^{(q-3)p+1}j(x)(1 + x^p)}\_{J(x)} + \\ &x^{(q-1)p+1}j(x). \end{aligned}$$

Since (1 + *x <sup>p</sup>*) (mod *x <sup>p</sup>* − 1) = 0 for a binary polynomial, *J* (*x*) = 0 and we have

$$\sum\_{i=0}^{2^{p-1}-1} x^i = 1 + x \cdot x^{(q-1)p} j(x) \pmod{x^p - 1}.$$

Because *x*<sup>*ip*</sup> (mod *x<sup>p</sup>* − 1) = 1,

$$\sum\_{i=0}^{2^{p-1}-1} x^i = 1 + xj(x) \pmod{x^p - 1}$$

$$= 1 + j(x) \pmod{x^p - 1}.$$
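Lemma 9.5 can also be verified numerically, e.g. by raising (1 + *x*) to the power 2<sup>*p*−1</sup> − 1 with square-and-multiply (a sketch; GF(2) polynomials are represented as sets of exponents, a choice made here for brevity):

```python
from collections import Counter

def polymul(a, b, m):
    # multiply two GF(2) polynomials, given as sets of exponents, modulo x^m - 1
    c = Counter((i + j) % m for i in a for j in b)
    return {e for e, k in c.items() if k % 2}

def polypow(a, n, m):
    # square-and-multiply exponentiation modulo x^m - 1
    r = {0}                      # the polynomial 1
    while n:
        if n & 1:
            r = polymul(r, a, m)
        a = polymul(a, a, m)
        n >>= 1
    return r

for p in [3, 5, 7, 11]:
    lhs = polypow({0, 1}, 2**(p - 1) - 1, p)  # (1 + x)^(2^(p-1)-1) mod x^p - 1
    assert lhs == set(range(1, p))            # equals 1 + j(x) over GF(2)
print("ok")
```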

For the rest of this chapter, we consider the bordered case only and, for convenience, unless otherwise stated, we shall assume that the term double-circulant code refers to (9.5b). Furthermore, we call the double-circulant codes based on primes congruent to ±1 modulo 8 the [*p* + 1, ½(*p* + 1), *d*] extended quadratic residue (QR) codes, since these exist only for *p* ≡ ±1 (mod 8).

Following Gaborit [2], we call those double-circulant codes based on primes congruent to ±3 modulo 8, i.e. *p* ≡ ±3 (mod 8), the [2(*p* + 1), *p* + 1, *d*] quadratic double-circulant (QDC) codes.

# *9.4.1 Double-Circulant Codes from Extended Quadratic Residue Codes*

The following is a summary of the extended QR codes as double-circulant codes [8, 9, 13].

Binary QR codes are cyclic codes of length *p* over F2. For a given *p*, there exist four QR codes:

1. *L̄<sub>p</sub>*, *N̄<sub>p</sub>*, which are equivalent and have dimension ½(*p* − 1), and
2. *L<sub>p</sub>*, *N<sub>p</sub>*, which are equivalent and have dimension ½(*p* + 1).

The (*p* + 1, ½(*p* + 1), *d*) extended quadratic residue code, denoted by *L̂<sub>p</sub>* (resp. *N̂<sub>p</sub>*), is obtained by annexing an overall parity check to *L<sub>p</sub>* (resp. *N<sub>p</sub>*). If *p* ≡ −1 (mod 8), *L̂<sub>p</sub>* (resp. *N̂<sub>p</sub>*) is Type-II; otherwise it is FSD.

It is well known that<sup>1</sup> Aut(*L̂<sub>p</sub>*) contains the projective special linear group denoted by PSL2(*p*) [13]. If *r* is a generator of the cyclic group *Q*, then σ : *i* → *ri* (mod *p*) is a member of PSL2(*p*). Given *n* ∈ *N*, the cycles of σ can be written as

<sup>1</sup>Since *L̂<sub>p</sub>* and *N̂<sub>p</sub>* are equivalent, considering either one is sufficient.

$$(\infty)(n, nr, nr^2, \dots, nr^t)(1, r, r^2, \dots, r^t)(0),\tag{9.13}$$

where *t* = ½(*p* − 3). Due to this property, *G*, the generator matrix of *L̂<sub>p</sub>*, can be arranged into circulants as shown in (9.14),

$$\mathbf{G} = \begin{array}{c|c|c|c|c}
 & \infty & n \;\; nr \;\; \dots \;\; nr^{t-1} \;\; nr^{t} & 1 \;\; r \;\; \dots \;\; r^{t-1} \;\; r^{t} & 0 \\ \hline
\infty & 1 & 1 \;\; 1 \;\; \dots \;\; 1 \;\; 1 & 1 \;\; 1 \;\; \dots \;\; 1 \;\; 1 & 1 \\ \hline
\beta & 0 & & & 1 \\
\beta r & 0 & & & 1 \\
\vdots & \vdots & \qquad \mathbf{L} & \qquad \mathbf{R} & \vdots \\
\beta r^{t} & 0 & & & 1
\end{array} \tag{9.14}$$

where *L* and *R* are ½(*p* − 1) × ½(*p* − 1) circulant matrices. The rows β, β*r*, ..., β*r<sup>t</sup>* in the above generator matrix contain *ē*<sub>β</sub>(*x*), *ē*<sub>β*r*</sub>(*x*), ..., *ē*<sub>β*r<sup>t</sup>*</sub>(*x*), where *ē<sub>i</sub>*(*x*) = *x<sup>i</sup>e*(*x*), whose coordinates are arranged in the order of (9.13). Note that (9.14) can be transformed to (9.5b) as follows:

$$
\begin{bmatrix} 1 & \mathbf{J} \\ \mathbf{0}^T & \mathbf{L}^{-1} \end{bmatrix} \times
\begin{bmatrix} 1 & \mathbf{J} & \mathbf{J} & 1 \\ \mathbf{0}^T & \mathbf{L} & \mathbf{R} & \mathbf{J}^T \end{bmatrix} =
\begin{bmatrix} 1 & \mathbf{J} + \mathbf{w}(\mathbf{L}^T) & \mathbf{J} + \mathbf{w}(\mathbf{R}^T) & \frac{1}{2}(p+1) \bmod 2 \\ \mathbf{0}^T & \mathbf{I} & \mathbf{L}^{-1}\mathbf{R} & \mathbf{w}(\mathbf{L}^{-1})^T \end{bmatrix} \tag{9.15}
$$

where *J* is an all-ones vector and **w**(*A*) = [wt*<sub>H</sub>*(*A*<sub>0</sub>) (mod 2), wt*<sub>H</sub>*(*A*<sub>1</sub>) (mod 2), ...], *A<sub>i</sub>* being the *i*th row vector of matrix *A*. The multiplication in (9.15) assumes that *L*<sup>−1</sup> exists; following Corollary 9.1, both wt*<sub>H</sub>*(*l*(*x*)) and wt*<sub>H</sub>*(*l*<sup>−1</sup>(*x*)) are then odd, so that **w**(*L*<sup>−1</sup>) = *J*. Therefore, (9.15) becomes

$$\mathbf{G} = \begin{bmatrix} 1 & \mathbf{0} & \mathbf{J} + \mathbf{w}(\mathbf{R}^T) & \frac{1}{2}(p+1) \bmod 2 \\ \mathbf{0}^T & \mathbf{I} & \mathbf{L}^{-1}\mathbf{R} & \mathbf{J}^T \end{bmatrix}. \tag{9.16}$$

In relation to (9.14), consider extended QR codes for the classes of primes:

1. *p* = 8*m* + 1: the idempotent *e*(*x*) = Σ<sub>*n*∈*N*</sub> *x<sup>n</sup>* and β ∈ *N*. Following [13, Theorem 24, Chap. 16], we know that *ē*<sub>β*r<sup>i</sup>*</sub>(*x*), where β*r<sup>i</sup>* ∈ *N*, for 0 ≤ *i* ≤ *t*, contains 2*m* + 1 quadratic residues modulo *p* (including 0) and 2*m* − 1 non-quadratic residues modulo *p*. As a consequence, wt*<sub>H</sub>*(*r*(*x*)) is even, implying **w**(*R<sup>T</sup>*) = **0**, and *r*(*x*) is not invertible (cf. Corollary 9.1); wt*<sub>H</sub>*(*l*(*x*)) is odd and *l*(*x*) may be invertible modulo *x*<sup>½(*p*−1)</sup> − 1 (cf. Corollary 9.1). Furthermore, referring to (9.5b), we have α = ½(*p* + 1) = 4*m* + 1 = 1 mod 2.

2. *p* = 8*m* − 1: the idempotent *e*(*x*) = 1 + Σ<sub>*n*∈*N*</sub> *x<sup>n</sup>* and β ∈ *Q*. Following [13, Theorem 24, Chap. 16], if we have a set *S* containing 0 and the 4*m* − 1 non-quadratic residues modulo *p*, the set β + *S* contains 2*m* + 1 quadratic residues modulo *p* (including 0) and 2*m* − 1 non-quadratic residues modulo *p*. It follows that *ē*<sub>β*r<sup>i</sup>*</sub>(*x*), where β*r<sup>i</sup>* ∈ *Q*, for 0 ≤ *i* ≤ *t*, contains 2*m* quadratic residues modulo *p* (excluding 0), implying that *R* is singular (cf. Corollary 9.1), and 2*m* − 1 non-quadratic residues modulo *p*, implying *L*<sup>−1</sup> may exist (cf. Corollary 9.1). Furthermore, **w**(*R<sup>T</sup>*) = **0** and, referring to (9.5b), we have α = ½(*p* + 1) = 4*m* = 0 mod 2.

For many *L*ˆ *<sup>p</sup>*, *L* is invertible and Karlin [9] has shown that *p* = 73, 97, 127, 137, 241 are the known cases where the canonical form (9.5b) cannot be obtained.

Consider the case for *p* = 73, with β = 5 ∈ *N*, we have *l*(*x*), the defining polynomial of the left circulant, given by

$$\begin{aligned} l(\mathbf{x}) = \; & \mathbf{x}^2 + \mathbf{x}^3 + \mathbf{x}^4 + \mathbf{x}^5 + \mathbf{x}^6 + \mathbf{x}^{11} + \mathbf{x}^{15} + \mathbf{x}^{16} + \mathbf{x}^{17} + \mathbf{x}^{18} + \mathbf{x}^{19} + \\ & \mathbf{x}^{20} + \mathbf{x}^{21} + \mathbf{x}^{25} + \mathbf{x}^{30} + \mathbf{x}^{31} + \mathbf{x}^{32} + \mathbf{x}^{33} + \mathbf{x}^{34}. \end{aligned}$$

The polynomial *l*(*x*) contains some irreducible factors of *x*<sup>½(*p*−1)</sup> − 1 = *x*<sup>36</sup> − 1, i.e. GCD(*l*(*x*), *x*<sup>36</sup> − 1) = 1 + *x*<sup>2</sup> + *x*<sup>4</sup>, and hence it is not invertible. In addition to form (9.5b), *G* can also be transformed to (9.5a), and Jenson [8] has shown that, for 7 ≤ *p* ≤ 199, except for *p* = 89, 167, the canonical form (9.5a) exists.

# *9.4.2 Pure Double-Circulant Codes for Primes* **±***3 Modulo 8*

Recall that **Sr** is a multiplicative group of order 2<sup>*p*−1</sup> − 1 containing all polynomials of odd weight (excluding the all-ones polynomial) of degree at most *p* − 1, where *p* is a prime. We assume that *a*(*x*) is a generator of **Sr**. For *p* ≡ ±3 (mod 8), we have the following lemma.

**Lemma 9.6** *For p* ≡ ±3 (mod 8)*, let the polynomials q*(*x*) = Σ<sub>*i*∈*Q*</sub> *x<sup>i</sup> and n*(*x*) = Σ<sub>*i*∈*N*</sub> *x<sup>i</sup>. Self-dual pure double-circulant codes with r*(*x*) = *q*(*x*) *or r*(*x*) = *n*(*x*) *exist if and only if p* ≡ 3 (mod 8)*.*

*Proof* For self-dual codes the condition *q*(*x*)<sup>*T*</sup> = *n*(*x*) must be satisfied, where *q*(*x*)<sup>*T*</sup> = *q*(*x*<sup>−1</sup>) = Σ<sub>*i*∈*Q*</sub> *x*<sup>−*i*</sup>. Let *r*(*x*) = *q*(*x*). For *p* ≡ ±3 (mod 8), 2 ∈ *N* and we have *q*(*x*)<sup>2</sup> = Σ<sub>*i*∈*Q*</sub> *x*<sup>2*i*</sup> = *n*(*x*). We know that 1 + *q*(*x*) + *n*(*x*) = *j*(*x*); therefore, for *p* ≡ 3 (mod 8), where wt*<sub>H</sub>*(*q*(*x*)) is odd so that *j*(*x*)*q*(*x*) = *j*(*x*), *q*(*x*)<sup>3</sup> = *q*(*x*)<sup>2</sup>*q*(*x*) = *n*(*x*)*q*(*x*) = (1 + *q*(*x*) + *j*(*x*))*q*(*x*) = *q*(*x*) + *n*(*x*) + *j*(*x*) = 1. Then, since *q*(*x*)<sup>3</sup> = 1, *q*(*x*)<sup>−1</sup> = *q*(*x*)<sup>2</sup> = *n*(*x*). On the other hand, −1 ∈ *N* if *p* ≡ 3 (mod 8) and thus *q*(*x*)<sup>*T*</sup> = *n*(*x*). If *p* ≡ −3 (mod 8), −1 ∈ *Q* and we have *q*(*x*)<sup>*T*</sup> = *q*(*x*). For *r*(*x*) = *n*(*x*), the same arguments follow.

Let *P<sup>p</sup>* denote a (2*p*, *p*, *d*) pure double-circulant code for *p* ≡ ±3 (mod 8). The properties of *P<sup>p</sup>* can be summarised as follows:

1. For *p* ≡ 3 (mod 8), since *q*(*x*)<sup>3</sup> = 1 and *a*(*x*)<sup>2<sup>*p*−1</sup>−1</sup> = 1, we have *q*(*x*) = *a*(*x*)<sup>(2<sup>*p*−1</sup>−1)/3</sup> and *q*(*x*)<sup>*T*</sup> = *a*(*x*)<sup>(2<sup>*p*</sup>−2)/3</sup>. There are two full-rank generator matrices with mutually disjoint information sets associated with *P<sub>p</sub>* for these primes. Let *G*<sub>1</sub> be a reduced echelon generator matrix of *P<sub>p</sub>*, which has the form of (9.5a) with *R* = *B*, where *B* is a circulant matrix with defining polynomial *b*(*x*) = *q*(*x*). The other full-rank generator matrix *G*<sub>2</sub> can be obtained as follows:

$$\mathbf{G}\_2 = \begin{bmatrix} \mathbf{B}^T \end{bmatrix} \times \mathbf{G}\_1 = \begin{bmatrix} \mathbf{B}^T & \mathbf{I}\_p \end{bmatrix}. \tag{9.17}$$

The self-duality of this pure double-circulant code is obvious from *G*2.

2. For *p* ≡ −3 (mod 8), (*p* − 1)/2 is even and hence neither *q*(*x*) nor *n*(*x*) is invertible, which means that if either polynomial were chosen as the defining polynomial for *P<sub>p</sub>*, there would exist only one full-rank generator matrix. However, 1 + *q*(*x*) (resp. 1 + *n*(*x*)) is invertible and the inverse is 1 + *n*(*x*) (resp. 1 + *q*(*x*)), i.e.

$$\begin{aligned} (1+q(x))(1+n(x)) &= 1 + q(x) + n(x) + q(x)n(x) \\ &= 1 + q(x) + n(x) + q(x)(1 + j(x) + q(x)) \\ &= 1 + q(x) + n(x) + q(x) + q(x)j(x) + q(x)^2, \end{aligned}$$

and since *q*(*x*)*j*(*x*) = 0 and *q*(*x*)<sup>2</sup> = *n*(*x*) under polynomial modulo *x <sup>p</sup>* − 1, it follows that

$$(1 + q(\mathbf{x}))(1 + n(\mathbf{x})) = 1 \pmod{\mathbf{x}^p - 1}.$$

Let *G*<sup>1</sup> be the first reduced echelon generator matrix, which has the form of (9.5a) where *R* = *I <sup>p</sup>* + *Q*. The other full-rank generator matrix with disjoint information sets *G*<sup>2</sup> can be obtained as follows:

$$\mathbf{G}\_2 = \begin{bmatrix} \mathbf{I}\_p + \mathbf{N} \end{bmatrix} \times \mathbf{G}\_1 = \begin{bmatrix} \mathbf{I}\_p + \mathbf{N} & \mathbf{I}\_p \end{bmatrix}. \tag{9.18}$$

Since −1 ∈ *Q* for this prime, (*I<sub>p</sub>* + *Q*)<sup>*T*</sup> = *I<sub>p</sub>* + *Q*, implying that the (2*p*, *p*, *d*) pure double-circulant code is FSD, i.e. the generator matrix of *P*<sup>⊥</sup><sub>*p*</sub> is given by

$$\mathbf{G}^{\perp} = \begin{bmatrix} \mathbf{I}\_p + \mathbf{Q} & \mathbf{I}\_p \end{bmatrix}.$$
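The inverse pair used in case 2 above is easy to verify numerically, e.g. for *p* = 13 ≡ −3 (mod 8) (a sketch; GF(2) polynomials are represented as sets of exponents, a choice made here for brevity):

```python
from collections import Counter

def polymul(a, b, m):
    # multiply two GF(2) polynomials, given as sets of exponents, modulo x^m - 1
    c = Counter((i + j) % m for i in a for j in b)
    return {e for e, k in c.items() if k % 2}

p = 13                                    # p = -3 mod 8
Q = {pow(i, 2, p) for i in range(1, p)}   # quadratic residues modulo 13
N = set(range(1, p)) - Q                  # non-residues
one_q = {0} | Q                           # 1 + q(x)
one_n = {0} | N                           # 1 + n(x)
print(polymul(one_q, one_n, p))           # {0}: (1 + q(x))(1 + n(x)) = 1 mod x^13 - 1
```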

A bordered double-circulant construction based on these primes—commonly known as the *quadratic double-circulant* construction—also exists, see Sect. 9.4.3 below.

#### *9.4.3 Quadratic Double-Circulant Codes*

Let *p* be a prime that is congruent to ±3 modulo 8. A (2(*p* + 1), *p* + 1, *d*) binary quadratic double-circulant code, denoted by *Bp*, can be constructed using the defining polynomial

$$b(\mathbf{x}) = \begin{cases} 1 + q(\mathbf{x}) & \text{if } p \equiv 3 \pmod{8}, \text{ and} \\ q(\mathbf{x}) & \text{if } p \equiv -3 \pmod{8} \end{cases} \tag{9.19}$$

where *q*(*x*) = Σ<sub>*i*∈*Q*</sub> *x<sup>i</sup>*. Following [13], the generator matrix *G* of *B<sub>p</sub>* is

$$\mathbf{G} = \begin{array}{c|c|c|c}
l\_\infty & l\_0 \;\ldots\; l\_{p-1} & r\_\infty & r\_0 \;\ldots\; r\_{p-1} \\ \hline
\begin{matrix} 1 \\ \vdots \\ 1 \end{matrix} & \mathbf{I}\_p & \begin{matrix} 0 \\ \vdots \\ 0 \end{matrix} & \mathbf{B} \\ \hline
0 & 0 \;\ldots\; 0 & 1 & 1 \;\ldots\; 1
\end{array} \tag{9.20}$$

which is equivalent to (9.5b) with α = 0 and *k* = *p* + 1 if the last row of *G* is rearranged as the first row and the columns indexed by *l*<sub>∞</sub> and *r*<sub>∞</sub> are rearranged as the last and the first columns, respectively. Let *j*(*x*) = 1 + *x* + *x*<sup>2</sup> + ··· + *x*<sup>*p*−1</sup>; the following are some properties of *B<sub>p</sub>* [9]:

1. for *p* ≡ 3 (mod 8), *b*(*x*)<sup>3</sup> = (1 + *q*(*x*))<sup>2</sup>(1 + *q*(*x*)) = (1 + *n*(*x*))(1 + *q*(*x*)) = 1 + *j*(*x*), since *q*(*x*)<sup>2</sup> = *n*(*x*) (2 ∈ *N* for this prime) and *q*(*x*)*j*(*x*) = *n*(*x*)*j*(*x*) = *j*(*x*) (wt*<sub>H</sub>*(*q*(*x*)) = wt*<sub>H</sub>*(*n*(*x*)) is odd). Also, (*b*(*x*) + *j*(*x*))<sup>3</sup> = (1 + *q*(*x*) + *j*(*x*))<sup>2</sup>(1 + *q*(*x*) + *j*(*x*)) = *n*(*x*)<sup>2</sup>(1 + *q*(*x*) + *j*(*x*)) = *q*(*x*) + *n*(*x*) + *j*(*x*) = 1 because *n*(*x*)<sup>2</sup> = *q*(*x*). Since −1 ∈ *N*, we have *b*(*x*)<sup>*T*</sup> = 1 + Σ<sub>*i*∈*Q*</sub> *x*<sup>−*i*</sup> = 1 + *n*(*x*) and thus *b*(*x*)*b*(*x*)<sup>*T*</sup> = (1 + *q*(*x*))(1 + *n*(*x*)) = 1 + *j*(*x*).

There are two full-rank generator matrices with disjoint information sets for *B<sub>p</sub>*. This is because, although *b*(*x*) has no inverse, *b*(*x*) + *j*(*x*) does, and the inverse is (*b*(*x*) + *j*(*x*))<sup>2</sup>.

Let *G*<sub>1</sub> have the form of (9.5b) where *R* = *B*; the other full-rank generator matrix *G*<sub>2</sub> can be obtained as follows:


$$\mathbf{G}\_2 = \begin{array}{c|c}
1 & 1 \;\ldots\; 1 \\ \hline
0 & \\
\vdots & \mathbf{B}^T \\
0 &
\end{array} \times \mathbf{G}\_1 = \begin{array}{c|c|c|c}
0 & 1 \;\ldots\; 1 & 1 & 0 \;\ldots\; 0 \\ \hline
1 & & 0 & \\
\vdots & \mathbf{B}^T & \vdots & \mathbf{I}\_p \\
1 & & 0 &
\end{array}. \tag{9.21}$$

It is obvious that *G*<sup>2</sup> is identical to the generator matrix of *B*<sup>⊥</sup> *<sup>p</sup>* and hence, it is self-dual.

2. for *p* ≡ −3 (mod 8), *b*(*x*)<sup>3</sup> = *n*(*x*)*q*(*x*) = (1 + *j*(*x*) + *q*(*x*))*q*(*x*) = 1 + *j*(*x*), since *q*(*x*)<sup>2</sup> = *n*(*x*) (2 ∈ *N* for this prime) and *q*(*x*)*j*(*x*) = *n*(*x*)*j*(*x*) = 0 (wt*<sub>H</sub>*(*q*(*x*)) = wt*<sub>H</sub>*(*n*(*x*)) is even). Also, (*b*(*x*) + *j*(*x*))<sup>3</sup> = (*q*(*x*) + *j*(*x*))<sup>2</sup>(1 + *n*(*x*)) = *q*(*x*)<sup>2</sup> + *q*(*x*)<sup>2</sup>*n*(*x*) + *j*(*x*)<sup>2</sup> + *j*(*x*)<sup>2</sup>*n*(*x*) = *n*(*x*) + *q*(*x*) + *j*(*x*) = 1 because *n*(*x*)<sup>2</sup> = *q*(*x*). Since −1 ∈ *Q*, we have *b*(*x*)<sup>*T*</sup> = Σ<sub>*i*∈*Q*</sub> *x*<sup>−*i*</sup> = *b*(*x*) and it follows that *B<sub>p</sub>* is FSD.

Since (*b*(*x*) + *j*(*x*))<sup>−1</sup> = (*b*(*x*) + *j*(*x*))<sup>2</sup>, there exist two full-rank generator matrices with disjoint information sets for *B<sub>p</sub>*. Let *G*<sub>1</sub> have the form of (9.5b) where *R* = *B*; the other full-rank generator matrix *G*<sub>2</sub> can be obtained as follows:

$$\mathbf{G}\_2 = \begin{array}{c|c}
1 & 1 \;\ldots\; 1 \\ \hline
0 & \\
\vdots & \mathbf{B}^2 \\
0 &
\end{array} \times \mathbf{G}\_1 = \begin{array}{c|c|c|c}
0 & 1 \;\ldots\; 1 & 1 & 0 \;\ldots\; 0 \\ \hline
1 & & 0 & \\
\vdots & \mathbf{B}^2 & \vdots & \mathbf{I}\_p \\
1 & & 0 &
\end{array}. \tag{9.22}$$
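The polynomial identities used in property 1 above, *b*(*x*)<sup>3</sup> = 1 + *j*(*x*) and *b*(*x*)*b*(*x*)<sup>*T*</sup> = 1 + *j*(*x*) for *p* ≡ 3 (mod 8), can be checked numerically for *p* = 11 (a sketch; GF(2) polynomials are represented as sets of exponents, a choice made here for brevity):

```python
from collections import Counter

def polymul(a, b, m):
    # multiply two GF(2) polynomials, given as sets of exponents, modulo x^m - 1
    c = Counter((i + j) % m for i in a for j in b)
    return {e for e, k in c.items() if k % 2}

p = 11                                    # p = 3 mod 8
Q = {pow(i, 2, p) for i in range(1, p)}   # quadratic residues modulo 11
b = {0} | Q                               # b(x) = 1 + q(x), per (9.19)
b3 = polymul(polymul(b, b, p), b, p)
bT = {(-e) % p for e in b}                # b(x^-1), i.e. b(x)^T
print(sorted(b3))                         # [1, 2, ..., 10], i.e. 1 + j(x)
print(sorted(polymul(b, bT, p)))          # likewise 1 + j(x)
```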

Codes of the form *B<sub>p</sub>* form an interesting family of double-circulant codes. In terms of self-dual codes, the family contains the longest known extremal Type-II code, *n* = 136; it is probably the longest extremal code that exists, see Sect. 9.7. Moreover, *B<sub>p</sub>* is the binary image of the extended QR code over F<sub>4</sub> [10].

The (*p* + 1, ½(*p* + 1), *d*) double-circulant codes for *p* ≡ ±1 (mod 8) are fixed by PSL2(*p*), see Sect. 9.4.1. This linear group PSL2(*p*) is generated by the set of all permutations of the coordinates (∞, 0, 1, ..., *p* − 1) of the form

$$\mathbf{y} \to \frac{a\mathbf{y} + b}{c\mathbf{y} + d},\tag{9.23}$$

where *a*, *b*, *c*, *d* ∈ F<sub>*p*</sub>, *ad* − *bc* = 1, *y* ∈ F<sub>*p*</sub> ∪ {∞}, and it is assumed that ±1/0 = ∞ and ±1/∞ = 0 in the arithmetic operations.

We know from [13] that this form of permutation is generated by the following transformations:

$$\begin{aligned} S &: y \rightarrow y + 1 \\ V &: y \rightarrow \alpha^2 y \\ T &: y \rightarrow -\frac{1}{y}, \end{aligned} \tag{9.24}$$

where α is a primitive element of F*p*. In fact, *V* is redundant since it can be obtained from *S* and *T* , i.e.

$$V = TS^{\alpha}TS^{\mu}TS^{\alpha} \tag{9.25}$$

for<sup>2</sup> <sup>μ</sup> <sup>=</sup> <sup>α</sup>−<sup>1</sup> <sup>∈</sup> <sup>F</sup>*p*.
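Identity (9.25) is easy to verify numerically. The sketch below does this for *p* = 7 with α = 3 (a primitive element of F<sub>7</sub>; this choice, and the convention of applying the operators of (9.25) from left to right with *T* first, are assumptions of this sketch):

```python
p, INF = 7, "inf"
alpha = 3                   # a primitive element of F_7 (assumption of this sketch)
mu = pow(alpha, -1, p)      # mu = alpha^(-1) = 5

def S(y, k):
    # S^k : y -> y + k, fixing infinity
    return y if y == INF else (y + k) % p

def T(y):
    # T : y -> -1/y, with T(0) = infinity and T(infinity) = 0
    if y == INF:
        return 0
    if y == 0:
        return INF
    return (-pow(y, -1, p)) % p

def V(y):
    # apply T, S^alpha, T, S^mu, T, S^alpha in left-to-right order
    for f in (T, lambda z: S(z, alpha), T, lambda z: S(z, mu), T, lambda z: S(z, alpha)):
        y = f(y)
    return y

assert all(V(y) == alpha * alpha * y % p for y in range(p))
assert V(INF) == INF
print("V(y) = alpha^2 y for all y")
```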

The linear group PSL2(*p*) fixes not only the (*p* + 1, ½(*p* + 1), *d*) binary double-circulant codes, for *p* ≡ ±1 (mod 8), but also the (2(*p* + 1), *p* + 1, *d*) binary quadratic double-circulant codes, as shown in the following. Consider the coordinates (∞, 0, 1, ..., *p* − 1) of a circulant: the transformation *S* leaves the coordinate ∞ invariant and introduces a cyclic shift to the rest of the coordinates, and hence *S* fixes a circulant. Let *R<sub>i</sub>* and *L<sub>i</sub>* denote the *i*th rows of the right and left circulants of (9.20), respectively (we assume that the index starts with 0), and let *J* and *J*′ denote the last rows of the right and left circulants of (9.20), respectively.

Consider the primes *p* = 8*m* + 3, for which *R*<sub>0</sub> = (0 | 1 + Σ<sub>*i*∈*Q*</sub> *x<sup>i</sup>*). If *i* ∈ *Q* then *i* = α<sup>*e*</sup> for some even integer *e*, while −1 = α<sup>*f*</sup> for some odd integer *f* since −1 ∈ *N* for these primes; hence −1/*i* = α<sup>*f*−*e*</sup> ∈ *N*. Therefore, the transformation *T* interchanges residues and non-residues. In addition, we also know that *T* interchanges coordinates ∞ and 0. Applying transformation *T* to *R*<sub>0</sub>, *T*(*R*<sub>0</sub>), results in

$$T(\mathcal{R}\_0) = \left(1 \mid \sum\_{j \in N} \boldsymbol{\chi}^j\right) = \mathcal{R}\_0 + \boldsymbol{J}.$$

Similarly, for the first row of *L*, which has 1 at coordinates ∞ and 0 only, i.e. *L*<sup>0</sup> = (1 | 1)

$$T(L\_0) = L\_0 + J'.$$

$$\begin{split} & \, ^2TS^aTS^\muTS^a(\mathbf{y}) = TS^aTS^\mu T(\mathbf{y} + a) = TS^aTS^\mu (-\mathbf{y}^{-1} + a) = TS^aT \left( -\frac{1}{\mathbf{y} + \mu} + a \right) = 0\\ & \left( \frac{^aTS^a}{\mathbf{y} + \mu} \right) = TS^a \left( \frac{-a\mathbf{y}^{-1} + a\mu - 1}{-\mathbf{y}^{-1} + \mu} \right) = T \left( \frac{-a(\mathbf{y} + a)^{-1} + a\mu - 1}{-(\mathbf{y} + a)^{-1} + \mu} \right) = T \left( \frac{(a\mu - 1)\mathbf{y} + a(a\mu - 1) - a}{\mu \mathbf{y} + (a\mu - 1)} \right) = 0\\ & \left( \frac{(-a\mu - 1)\mathbf{y}^{-1} + a(a\mu - 1) - a}{-\mu \mathbf{y}^{-1} + (a\mu - 1)} \right) = a^2 \mathbf{y} = V(\mathbf{y}). \end{split}$$

Let*s* ∈ *Q* and let the set *Q*ˆ = *Q* ∪ {0}, *R<sup>s</sup>* = 0 | *<sup>i</sup>*∈*Q*<sup>ˆ</sup> *<sup>x</sup><sup>s</sup>*+*<sup>i</sup>* and *T <sup>i</sup>*∈*Q*<sup>ˆ</sup> *<sup>x</sup><sup>s</sup>*+*<sup>i</sup>* = *<sup>i</sup>*∈*Q*<sup>ˆ</sup> *<sup>x</sup>*−1/(*s*+*i*) . Following MacWilliams and Sloane [13, Theorem 24, Chap. 16], we know that the exponents of *<sup>i</sup>*∈*Q*<sup>ˆ</sup> *<sup>x</sup><sup>s</sup>*+*<sup>i</sup>* contain 2*<sup>m</sup>* <sup>+</sup> 1 residues and 2*m* + 1 non-residues. Note that *s* + *i* produces no 0.3 It follows that −1/(*s* + *<sup>i</sup>*) contains 2*<sup>m</sup>* <sup>+</sup> 1 non-residues and 2*<sup>m</sup>* <sup>+</sup> 1 residues. Now consider *<sup>R</sup>*−1/*<sup>s</sup>* <sup>=</sup> 0 | *<sup>i</sup>*∈*Q*<sup>ˆ</sup> *<sup>x</sup><sup>i</sup>*−1/*<sup>s</sup>* , *i* − 1/*s* contains<sup>4</sup> 0 *i*,*s* ∈ *Q*, 2*m* residues and 2*m* + 1 nonresidues. We can write −1/(*s* + *i*) as

$$-\frac{1}{s+i} = \frac{i/s}{s+i} - \frac{1}{s} = z - \frac{1}{s} \dots$$

Let *I* ⊂ *Q*ˆ be a set of all residues such that for all *i* ∈ *I*, *i* − 1/*s* ∈ *N*. If −1/ (*s* + *i*) ∈ *N*, *z* ∈ *Q*ˆ and we can see that *z* must belong to *I* such that *z* − 1/*s* ∈ *N*. This means these non-residues cancel each other in *T* (*Rs*) + *R*−1/*<sup>s</sup>*. On the other hand, if −1/(*s* + *i*) ∈ *Q*, *z* ∈ *N* and it is obvious that *z* − 1/*s* = *i* − 1/*s* for all *i* ∈ *Q*ˆ , implying that all 2*m* + 1 residues in *T* (*Rs*) are disjoint from all 2*m* + 1 residues (including 0) in *R*−1/*<sup>s</sup>*. Therefore, *T* (*Rs*) + *R*−1/*<sup>s</sup>* = 0 | *<sup>i</sup>*∈*Q*<sup>ˆ</sup> *<sup>x</sup><sup>i</sup>* , i.e.

$$T(\mathcal{R}\_s) = \mathcal{R}\_{-1/s} + \mathcal{R}\_0.$$

Similarly, *T* (*Ls*) = 0 | 1 + *x*−1/*<sup>s</sup>* and *L*−1/*<sup>s</sup>* = 1 | *x*−1/*<sup>s</sup>* , which means

$$T(L\_s) = L\_{-1/s} + L\_0.$$

Let *t* ∈ *N*, *R<sup>t</sup>* = 0 | *<sup>i</sup>*∈*Q*<sup>ˆ</sup> *<sup>x</sup>t*+*<sup>i</sup>* and *T <sup>i</sup>*∈*Q*<sup>ˆ</sup> *<sup>x</sup>t*+*<sup>i</sup>* = *<sup>i</sup>*∈*Q*<sup>ˆ</sup> *<sup>x</sup>*−1/(*t*+*i*) . We know that *t* + *i* contains 0, 2*m* residues and 2*m* + 1 non-residues [13, Theorem 24, Chap. 16], and correspondingly −1/(*t* + *i*) contains ∞, 2*m* non-residues and 2*m* + 1 residues. As before, now consider *R*−1/*<sup>t</sup>* = 0 | *<sup>i</sup>*∈*Q*<sup>ˆ</sup> *<sup>x</sup><sup>i</sup>*−1/*<sup>t</sup>* . There are 2*m* + 1 residues (excluding 0) and 2*m* + 1 non-residues in *i* − 1/*t*, and let *I* ⊂ *Q*ˆ be a set of all residues such that, for all *i* ∈ *I* , *i* − 1/*t* ∈ *Q*. As before, we can write −1/(*t* + *i*) as *z* − 1/*t*, where *z* = (*i*/*t*)/(*t* + *i*). If −1/(*t* + *i*) ∈ *Q*, *z* ∈ *I* and hence, the 2*m* + 1 residues from −1/(*t* + *i*) are identical to those from *i* − 1/*t*. If −1/(*t* + *i*) ∈ *N*, *z* ∈ *N* and hence, all of the 2*m* non-residues of −1/(*t* + *i*) are disjoint from all 2*<sup>m</sup>* <sup>+</sup> 1 non-residues of *<sup>i</sup>* <sup>−</sup> <sup>1</sup>/*t*. Therefore, *<sup>T</sup>* (*Rt*) <sup>+</sup> *<sup>R</sup>*−1/*<sup>t</sup>* <sup>=</sup> 1 | *<sup>i</sup>*∈*<sup>N</sup> <sup>x</sup><sup>i</sup>* , i.e.

$$T(\mathbf{R}\_t) = \mathbf{R}\_{-1/t} + \mathbf{R}\_0 + J.$$

<sup>3</sup>Consider a prime *<sup>p</sup>* = ±<sup>3</sup> (mod 8), *<sup>q</sup>* <sup>∈</sup> *<sup>Q</sup>* and an integer *<sup>a</sup>* where (*a*, *<sup>p</sup>*) <sup>=</sup> 1. In order for *q* + *a* = 0 to happen, *a* = −*q*. The integer *a* is a residue if *p* = 8*m* − 3 and a non-residue if *p* = 8*m* + 3.

<sup>4</sup>This is because all *<sup>i</sup>* <sup>∈</sup> *<sup>Q</sup>* are considered and 1/*<sup>s</sup>* <sup>∈</sup> *<sup>Q</sup>*.

Similarly, *T* (*Lt*) = 0 | 1 + *x*−1/*<sup>t</sup>* and *L*−1/*<sup>t</sup>* = 1 | *x*−1/*<sup>t</sup>* , which means

$$T(L\_t) = L\_{-1/t} + L\_0 + J'.$$

For primes *p* = 8*m* − 3, *R*<sup>0</sup> = 0 | *<sup>i</sup>*∈*<sup>Q</sup> <sup>x</sup><sup>i</sup>* and since −1 ∈ *Q*, −1/*i* ∈ *Q* for *i* ∈ *Q*. Thus,

$$T(\mathcal{R}\_0) = \left(0 \mid \sum\_{i \in \mathcal{Q}} x^{-1/i} \right) = \mathcal{R}\_0.$$

Similarly, for *L*0, which contains 1 at coordinates 0 and ∞,

$$T\left(L\_0\right) = L\_0.$$

Consider *R<sup>s</sup>* = 0 | *<sup>i</sup>*∈*<sup>Q</sup> <sup>x</sup>s*+*<sup>i</sup>* , for*s* ∈ *Q*, *T <sup>i</sup>*∈*<sup>Q</sup> <sup>x</sup>s*+*<sup>i</sup>* = *<sup>i</sup>*∈*<sup>Q</sup> <sup>x</sup>*−1/(*s*+*i*) . There are 0 (when *i* = −*s* ∈ *Q*), 2*m* − 2 residues and 2*m* − 1 non-residues in the set *s* + *i* [13, Theorem 24, Chap. 16]. Correspondingly, −1/(*s* + *i*) = *z* − 1/*s*, where *z* = (*i*/*s*)/(*s* + *i*), contains ∞, 2*m* − 2 residues and 2*m* − 1 non-residues. Now consider *R*−1/*<sup>s</sup>* = 0 | *<sup>i</sup>*∈*<sup>Q</sup> <sup>x</sup>i*−1/*<sup>s</sup>* , the set *i* − 1/*s* contains 0 (when *i* = 1/*s* ∈ *Q*), 2*m* − 2 residues and 2*m* − 1 non-residues. Let *I* ⊂ *Q* be a set of all residues such that for all *i* ∈ *I*, *i* − 1/*s* ∈ *Q*. If −1/(*s* + *i*) ∈ *Q* then *z* − 1/*s* ∈ *Q* which means *z* ∈ *Q* and *z* must belong to *I*. This means all 2*m* − 2 residues of −1/(*s* + *i*) and those of *i* − 1/*s* are identical. On the contrary, if −1/(*s* + *i*) ∈ *N*, *z* ∈ *N* and this means *z* − 1/*s* = *i* − 1/*s* for all *i* ∈ *Q*, and therefore all non-residues in −1/(*s* + *i*) and *i* − 1/*s* are mutually disjoint. Thus, *T* (*Rs*) + *R*−1/*<sup>s</sup>* = 1 | 1 + *<sup>i</sup>*∈*<sup>N</sup> <sup>x</sup><sup>i</sup>* , i.e.

$$T(\mathbf{R}\_s) = \mathbf{R}\_{-1/s} + \mathbf{R}\_0 + J.$$

Similarly, *T* (*Ls*) = 0 | 1 + *x*−1/*<sup>s</sup>* , and we can write

$$T(L\_s) = L\_{-1/s} + L\_0 + J'.$$

For*t* ∈ *N*, we have *R<sup>t</sup>* = 0 | *<sup>i</sup>*∈*<sup>Q</sup> <sup>x</sup><sup>t</sup>*+*<sup>i</sup>* and *T* ( *<sup>i</sup>*∈*<sup>Q</sup> <sup>x</sup><sup>t</sup>*+*<sup>i</sup>* ) = *<sup>i</sup>*∈*<sup>Q</sup> <sup>x</sup>*−1/(*t*+*i*) . Following [13, Theorem 24, Chap. 16], there are 2*m* − 1 residues and 2*m* − 1 nonresidues in the set *t* + *i* and the same distributions are contained in the set −1/(*t* + *i*). Considering *R*−1/*<sup>t</sup>* = 0 | *<sup>i</sup>*∈*<sup>Q</sup> <sup>x</sup><sup>i</sup>*−1/*<sup>t</sup>* , there are 2*m* − 1 residues and 2*m* − 1 nonresidues in *i* − 1/*t*. Rewriting −1/(*t* + *i*) = *z* − 1/*t*, for *z* = (*i*/*t*)/(*t* + *i*), and letting *I* ⊂ *Q* be a set of all residues such that for all *i* ∈ *I* , *i* − 1/*t* ∈ *N*, we know that if −1/(*t* + *i*) ∈ *N* then *z* − 1/*t* ∈ *N* which means that *z* ∈ *Q* and *z* must belong to *I* . Hence, the non-residues in *i* − 1/*t* and −1/(*t* + *i*) are identical. If −1/(*t* + *i*) ∈ *Q*, however, *z* ∈ *N* and for all *i* ∈ *Q*, *i* − 1/*t* = *z* − 1/*t*, implying that the residues in −1/(*t* + *i*) and *i* − 1/*t* are mutually disjoint. Thus, *T* (*Rt*) + *R*−1/*<sup>t</sup>* = 0 | *<sup>i</sup>*∈*<sup>Q</sup> <sup>x</sup><sup>i</sup>* , i.e.

$$T(\mathcal{R}\_t) = \mathcal{R}\_{-1/t} + \mathcal{R}\_0.$$

Similarly, *T* (*Lt*) = 0 | 1 + *x*−1/*<sup>t</sup>* , and we can write

$$T(L\_t) = L\_{-1/t} + L\_0.$$

The effect *T* to the circulants is summarised as follows:


where *s* ∈ *Q* and *t* ∈ *N*. This shows that, for *p* ≡ ±3 (mod 8), the transformation *T* is a linear combination of at most three rows of the circulant and hence it fixes the circulant. This establishes the following theorem on Aut(*Bp*) [2, 13].

**Theorem 9.1** *The automorphism group of the* (2(*p* + 1), *p* + 1, *d*) *binary quadratic double-circulant codes contains PSL*2(*p*) *applied simultaneously to both circulants.*

The knowledge of Aut(*Bp*) can be exploited to deduce the modular congruence weight distributions of *B<sup>p</sup>* as shown in Sect. 9.6.

# **9.5 Evaluation of the Number of Codewords of Given Weight and the Minimum Distance: A More Efficient Approach**

In Chap. 5 algorithms to compute the minimum distance of a binary linear code and to count the number of codewords of a given weight are described. Assuming the code rate of the code is a half and its generator matrix contains two mutually disjoint information sets, each of rank *k* (the code dimension), these algorithms require enumeration of

$$
\binom{k}{w/2} + 2\sum\_{i=1}^{w/2-1} \binom{k}{i}
$$

codewords in order to count the number of codewords of weight *w*. For FSD doublecirculant codes with *p* ≡ −3 (mod 8) and self-dual double-circulant codes a more efficient approach exists. This approach applies to both pure and bordered doublecirculant cases.

**Lemma 9.7** *Let Tm*(*x*) *be a set of binary polynomials with degree at most m. Let ui*(*x*), *vi*(*x*) ∈ *Tk*−1(*x*) *for i* = 1, 2*, and e*(*x*), *f* (*x*) ∈ *Tk*−2(*x*)*. The numbers of weight w codewords of the form c*1(*x*) = (*u*1(*x*)|*v*1(*x*)) *and c*2(*x*) = (*v*2(*x*)|*u*2(*x*)) *are equal, where*


#### *Proof*

	- a. ε = 0 and wt*<sup>H</sup>* (*e*(*x*)) is odd, we have a codeword *c*1(*x*) = (0 | *e*(*x*) | 1 | *f* (*x*)) from *G*1. Applying 0 | *e*(*x*)*<sup>T</sup>* as an information vector to *G*2, we have another codeword *c*2(*x*) = 1 | *e*(*x*)*<sup>T</sup> r*(*x*)*<sup>T</sup>* | 0 | *e*(*x*)*<sup>T</sup>* = 1 | *f* (*x*)*<sup>T</sup>* | 0 | *e*(*x*)*<sup>T</sup>* .
	- b. ε = 1 and wt*<sup>H</sup>* (*e*(*x*)) is odd, *G*<sup>1</sup> produces *c*1(*x*) = (1 | *e*(*x*) | 1 | *f* (*x*) + *j*(*x*)). Applying 1 | *e*(*x*)*<sup>T</sup>* as an information vector to *G*2, we have a codeword *c*2(*x*) = 1 | *e*(*x*)*<sup>T</sup> r*(*x*)*<sup>T</sup>* + *j*(*x*) | 1 | *e*(*x*)*<sup>T</sup>* = 1 | *f* (*x*)*<sup>T</sup>* + *j*(*x*) | 1 | *e*(*x*)*<sup>T</sup>* .
	- c. ε = 0 and wt*<sup>H</sup>* (*e*(*x*))is even, *G*<sup>1</sup> produces a codeword *c*1(*x*) = (0 | *e*(*x*) | 0 | *f* (*x*)). Applying 0 | *e*(*x*)*<sup>T</sup>* as an information vector to *G*2, we have another codeword *c*2(*x*) = 0 | *e*(*x*)*<sup>T</sup> r*(*x*)*<sup>T</sup>* | 0 | *e*(*x*)*<sup>T</sup>* = 0 | *f* (*x*)*<sup>T</sup>* | 0 | *e*(*x*)*<sup>T</sup>* .
	- d. ε = 1 and wt*<sup>H</sup>* (*e*(*x*)) is even, *G*<sup>1</sup> produces *c*1(*x*) = (1 | *e*(*x*) | 0 | *f* (*x*)+ *j*(*x*)). Applying 1 | *e*(*x*)*<sup>T</sup>* as an information vector to *G*2, we have a codeword *c*2(*x*) = 0 | *e*(*x*)*<sup>T</sup> r*(*x*)*<sup>T</sup>* + *j*(*x*) | 1 | *e*(*x*)*<sup>T</sup>* = 0 | *f* (*x*)*<sup>T</sup>* + *j*(*x*) | 1 | *e*(*x*)*<sup>T</sup>* .

It is clear that in all cases, wt*<sup>H</sup>* (*c*1(*x*)) = wt*<sup>H</sup>* (*c*2(*x*)) since wt*<sup>H</sup>* (*v*(*x*)) = wt*<sup>H</sup>* (*v*(*x*)*<sup>T</sup>* ) and wt*<sup>H</sup>* (*v*(*x*) + *j*(*x*)) = wt*<sup>H</sup>* (*v*(*x*)*<sup>T</sup>* + *j*(*x*)) for some polynomial *v*(*x*). This means that given an information vector, there always exist two distinct codewords of the same weight.

(iii) Let *G*1, given by (9.5a) with *R* = *I <sup>p</sup>* + *Q*, and *G*2, given by (9.18), be two fullrank generator matrices with pairwise disjoint information sets, of pure FSD double-circulant codes for *p* ≡ −3 (mod 8).

Given *u*1(*x*) as input, we have a codeword *c*1(*x*) = (*u*1(*x*)|*v*1(*x*)), where *v*1(*x*) = *u*1(*x*)(1 + *q*(*x*)), from *G*<sup>1</sup> and another codeword *c*2(*x*) = (*v*2(*x*)|*u*<sup>2</sup> (*x*)), where *u*2(*x*) = *u*1(*x*)<sup>2</sup> and *v*2(*x*) = *u*1(*x*)<sup>2</sup>(1 + *n*(*x*)) = *u*1(*x*)<sup>2</sup>(1 +*q*(*x*))<sup>2</sup> = *v*1(*x*)2, from *G*2. Since the weight of a polynomial and that of its square are the same over F2, the proof follows.

	- a. ε = 0 and wt*<sup>H</sup>* (*e*(*x*))is odd, we have a codeword *c*1(*x*)= (0 | *e*(*x*) | 1 | *f* (*x*)) from *G*1. Applying 0 | *e*(*x*)<sup>2</sup> as an information vector to *G*2, we have another codeword *c*2(*x*) = 1 | *e*(*x*)<sup>2</sup>*n*(*x*) | 0 | *e*(*x*)<sup>2</sup> . Since *e*(*x*)<sup>2</sup>*n*(*x*) = *e*(*x*)<sup>2</sup>*b*(*x*)<sup>2</sup> = *f* (*x*)2, the codeword *c*<sup>2</sup> = 1 | *f* (*x*)<sup>2</sup> | 0 | *e*(*x*)<sup>2</sup> .
	- b. ε = 1 and wt*<sup>H</sup>* (*e*(*x*)) is odd, *G*<sup>1</sup> produces *c*1(*x*) = (1 | *e*(*x*) | 1 | *f* (*x*) + *j*(*x*)). Applying 1 | *e*(*x*)<sup>2</sup> as an information vector to *G*2, we have a codeword *c*2(*x*) = 1 | *e*(*x*)<sup>2</sup>*n*(*x*) + *j*(*x*) | 1 | *e*(*x*)<sup>2</sup> = 1 | *f* (*x*)<sup>2</sup> + *j*(*x*) | 1 | *e*(*x*)<sup>2</sup> .
	- c. ε = 0 and wt*<sup>H</sup>* (*e*(*x*)) is even, *G*<sup>1</sup> produces a codeword *c*1(*x*) = (0 | *e*(*x*) | 0 | *f* (*x*)). Applying 0 | *e*(*x*)<sup>2</sup> as an information vector to *G*2, we have another codeword *c*2(*x*) = 0 | *e*(*x*)<sup>2</sup>*n*(*x*) | 0 | *e*(*x*)<sup>2</sup> = 0 | *f* (*x*)<sup>2</sup> | 0 | *e*(*x*)<sup>2</sup> .
	- d. ε = 1 and wt*<sup>H</sup>* (*e*(*x*)) is even, *G*<sup>1</sup> produces *c*1(*x*) = (1 | *e*(*x*) | 0 | *f* (*x*) + *j*(*x*)). Applying 1 | *e*(*x*)<sup>2</sup> as an information vector to *G*2, we have a codeword *c*2(*x*) = 0 | *e*(*x*)<sup>2</sup>*n*(*x*) + *j*(*x*) | 1 | *e*(*x*)<sup>2</sup> = 0 | *f* (*x*)<sup>2</sup> + *j*(*x*) | 1 | *e*(*x*)<sup>2</sup> .

It is clear that in all cases, wt*<sup>H</sup>* (*c*1(*x*)) = wt*<sup>H</sup>* (*c*2(*x*)) since wt*<sup>H</sup>* (*v*(*x*)) = wt*<sup>H</sup>* (*v*(*x*)<sup>2</sup>) and wt*<sup>H</sup>* (*v*(*x*) + *j*(*x*)) = wt*<sup>H</sup>* (*v*(*x*)<sup>2</sup> + *j*(*x*)) for some polynomial *v*(*x*). This means that given an information vector, there always exist two distinct codewords of the same weight.

From Lemma 9.7, it follows that, in order to count the number of codewords of weight *w*, we only require

$$\sum\_{i=1}^{\mathbb{W}/2} \binom{k}{i}$$

codewords to be enumerated and if *Aw* denotes the number of codewords of weight *w*,

230 9 Algebraic Quasi Cyclic Codes

$$A\_{\le} = a\_{\le/2} + 2\sum\_{i=1}^{\le/2-1} a\_i \tag{9.26}$$

where *ai* is the number of weight *w* codewords which have *i* non-zeros in the first *k* coordinates.

Similarly, the commonly used method to compute the minimum distance of halfrate codes with two full-rank generator matrices of mutually disjoint information sets, for example, see van Dijk et al. [18], assuming that *d* is the minimum distance of the code, requires as many as

$$S = 2\sum\_{i=1}^{d/2-1} \binom{n}{i}$$

codewords to be enumerated. Following Lemma 9.7, only *S*/2 codewords are required for *P<sup>p</sup>* and *B<sup>p</sup>* for *p* ≡ −3 (mod 8), and self-dual double-circulant codes. Note that the bound *d*/2 − 1 may be improved for singly even and doubly even codes, but we consider the general case here.

#### **9.6 Weight Distributions**

The automorphism group of both (*<sup>p</sup>* <sup>+</sup> <sup>1</sup>, <sup>1</sup> <sup>2</sup> (*p* + 1), *d*) extended QR and (2(*p* + 1), *p* + 1, *d*) quadratic double-circulant codes contains the projective special linear group, PSL2(*p*). Let *H* be a subgroup of the automorphism group of a linear code, and the number of codewords of weight *i*, denoted by *Ai* , can be categorised into two classes:


Thus, we can write *Ai* in terms of congruence as follows:

$$\begin{split} A\_i &= n\_i \times |\mathcal{J}\ell| + A\_i(\mathcal{J}\ell), \\ &\equiv A\_i(\mathcal{J}\ell) \pmod{|\mathcal{J}\ell|} \end{split} \tag{9.27}$$

where *Ai*(*H* ) is the number of codewords of weight *i* fixed by some element of *H* . This was originally shown by Mykkeltveit et al. [14], where it was applied to extended QR codes for primes 97 and 103.

# *9.6.1 The Number of Codewords of a Given Weight in Quadratic Double-Circulant Codes*

For *<sup>B</sup>p*, we shall choose *<sup>H</sup>* <sup>=</sup> PSL2(*p*), which has order <sup>|</sup>*<sup>H</sup>* | = <sup>1</sup> <sup>2</sup> *p*(*p*<sup>2</sup> − 1). Let the matrix ! *a b c d* " represent an element of PSL2(*p*), see (9.23). Since |*H* | can be factorised as |*H* | = # *<sup>j</sup> <sup>q</sup><sup>e</sup> <sup>j</sup> <sup>j</sup>* , where *qj* is a prime and *e <sup>j</sup>* is some integer, *Ai*(*H* ) (mod |*H* |) can be obtained by applying the Chinese remainder theorem to *Ai*(*Sqj*) (mod *q<sup>e</sup> <sup>j</sup> <sup>j</sup>* ) for all *qj* that divides |*H* |, where *Sqj* is the Sylow-*qj*-subgroup of *H* . In order to compute *Ai*(*Sqj*), a subcode of *B<sup>p</sup>* which is invariant under *Sqj* needs to be obtained in the first place. This invariant subcode, in general, has a considerably smaller dimension than *Bp*, and hence, its weight distribution can be easily obtained.

For each odd prime *qj* , *Sqj* is a cyclic group which can be generated by some ! *a b c d* " ∈ PSL2(*p*) of order *qj* . Because *Sqj* is cyclic, it is straightforward to obtain the invariant subcode, from which we can compute *Ai*(*Sqj*).

On the other hand, the case of *qj* = 2 is more complicated. For *qj* = 2, *S*<sup>2</sup> is a dihedral group of order 2*<sup>m</sup>*+1, where *m* + 1 is the maximum power of 2 that divides |*H* | [**?** ]. For *p* = 8*m* ± 3, we know that

$$|\mathcal{H}'| = \frac{1}{2}(8m \pm 3)\left((8m \pm 3)^2 - 1\right) = 2^2\left(64m^3 \pm 72m^2 + 26m \pm 3\right), \dots$$

which shows that the highest power of 2 that divides |*H* | is 2<sup>2</sup> (*m* = 1). Following [**?** ], there are 2*<sup>m</sup>* + 1 subgroups of order 2 in *S*2, namely

$$\begin{aligned} H\_2 &= \{1, P\}, \\ G\_2^0 &= \{1, T\}, \text{ and} \\ G\_2^1 &= \{1, PT\}, \end{aligned}$$

where *P*, *T* ∈ PSL2(*p*), *P*<sup>2</sup> = *T* <sup>2</sup> = 1 and *T PT* <sup>−</sup><sup>1</sup> = *P*−1.

Let *<sup>T</sup>* <sup>=</sup> ! <sup>0</sup> *<sup>p</sup>*−<sup>1</sup> 1 0 " , which has order 2. It can be shown that any order 2 permutation, *P* = ! *a b c d* " , if a constraint *b* = *c* is imposed, we have *a* = −*d*. All these subgroups, however, are conjugates in PSL2(*p*) [**?** ] and therefore, the subcodes fixed by *G*<sup>0</sup> 2, *G*<sup>1</sup> 2 and *H*<sup>2</sup> have identical weight distributions and considering any of them, say *G*<sup>0</sup> 2, is sufficient.

Apart from 2*<sup>m</sup>* + 1 subgroups of order 2, *S*<sup>2</sup> also contains a cyclic subgroup of order 4, 2*<sup>m</sup>*−<sup>1</sup> non-cyclic subgroups of order 4, and subgroups of order 2*<sup>j</sup>* for *j* ≥ 3.

Following [14], only the subgroups of order 2 and the non-cyclic subgroups of order 4 make contributions towards *Ai*(*S*2). For *p* ≡ ±3 (mod 8), there is only one non-cyclic subgroup of order 4, denoted by *G*4, which contains, apart from an identity, three permutations of order 2 [**?** ], i.e. a Klein 4 group,

$$G\_4 = \{1, P, T, PT\}.$$

Having obtained *Ai*(*G*<sup>0</sup> <sup>2</sup>) and *Ai*(*G*4), following the argument in [14], the number of codewords of weight *i* that are fixed by some element of *S*<sup>2</sup> is given by

$$A\_i(S\_2) \equiv \Im A\_i(G\_2^0) - 2A\_i(G\_4) \pmod{4}.\tag{9.28}$$

In summary, in order to deduce the modular congruence of the number of weight *i* codewords in *Bp*, it is sufficient to do the following steps:


Given *B<sup>p</sup>* and an element of PSL2(*p*), how can we find the subcode consisting of the codewords fixed by this element? Assume that *Z* = ! *a b c d* " ∈ PSL2(*p*) of prime order. Let *cli* (resp. *cri*) and *cli* (resp. *cri*) denote the *i*th coordinate and π*<sup>Z</sup>* (*i*)th coordinate (*i*th coordinate with the respect to permutation π*<sup>Z</sup>* ), in the left (resp. right) circulant form, respectively. The invariant subcode can be obtained by solving a set of linear equations consisting of the parity-check matrix of *B<sup>p</sup>* (denoted by *H*), *cli* + *cli* = 0 (denoted by *π<sup>Z</sup>* (*L*)) and *cri* + *cri* = 0 (denoted by *π<sup>Z</sup>* (*R*)) for all *<sup>i</sup>* <sup>∈</sup> <sup>F</sup>*<sup>p</sup>* ∪ {∞}, i.e.

The solution to *Hsub* is a matrix of rank *r* > (*p* + 1), which is the parity-check matrix of the (2(*p* + 1), 2(*p* + 1) − *r*, *d* ) invariant subcode. For subgroup *G*4, which consists of permutations *P*, *T* and *PT* , we need to solve the following matrix


to obtain the invariant subcode. Note that the parity-check matrix of *B<sup>p</sup>* is assumed to have the following form:

$$\mathbf{H} = \begin{array}{c|c|c|c} \hline I\_{\infty} \ l\_{0} & \dots \ l\_{p-1} \ r\_{\infty} \ r\_{0} & \dots \ r\_{p-1} \\ \hline 0 & & 1 \\ \vdots & & \vdots \\ 0 & & & 1 \\ \hline 1 & 1 & \dots & 1 & 0 & 0 & \dots & 0 \\ \hline \end{array} \tag{9.29}$$

One useful application of the modular congruence of the number of codewords of weight *w* is to verify, independently, the number of codewords of a given weight *w* that were computed exhaustively.

Computing the number of codewords of a given weight in small codes using a single-threaded algorithm is tractable, but for longer codes, it is necessary to use multiple computers working in parallel to produce a result within a reasonable time. Even so it can take several weeks, using hundreds of computers, to evaluate a long code. In order to do the splitting, the codeword enumeration task is distributed among all of the computers and each computer just needs to evaluate a predetermined number of codewords, finding the partial weight distributions. In the end, the results are combined to give the total number of codewords of a given weight. There is always the possibility of software bugs or mistakes to be made, particularly in any parallel computing scheme. The splitting may not be done correctly or double-counting or miscounting introduced as a result, apart from possible errors in combining the partial results. Fortunately, the modular congruence approach can also provide detection of computing errors by revealing inconsistencies in the summed results. The importance of this facet of modular congruence will be demonstrated in determining the weight distributions of extended QR codes in Sect. 9.6.2. In the following examples we work through the application of the modular congruence technique in evaluating the weight distributions of the quadratic double-circulant codes of primes 37 and 83.

*Example 9.1* For prime 37, there exists an FSD (76, 38, 12) quadratic doublecirculant code, *B*37. The weight enumerator of an FSD code is given by Gleason's theorem [15]

$$A(z) = \sum\_{i=0}^{\lfloor \frac{n}{8} \rfloor} K\_i (1+z^2)^{\frac{4}{5}-4i} (z^2 - 2z^4 + z^6)^i \tag{9.30}$$

for integers *Ki* . The number of codewords of any weight *w* is given by the coefficient of *z<sup>w</sup>* of *A*(*z*). In order to compute *A*(*z*) of *B*37, we need only to compute *A*2*<sup>i</sup>* for 6 ≤ *i* ≤ 9. Using the technique described in Sect. 9.5, the number of codewords of desired weights is obtained and then substituted into (9.30). The resulting weight enumerator function giving the whole weight distribution of the (76, 38, 12) code, *B*<sup>37</sup> is

$$\begin{split} A(z) &= \left(1 + z^{76}\right) + 2109 \times \left(z^{12} + z^{64}\right) + \\ &86469 \times \left(z^{16} + z^{60}\right) + 961704 \times \left(z^{18} + z^{38}\right) + \\ &7489059 \times \left(z^{20} + z^{56}\right) + 53574224 \times \left(z^{22} + z^{54}\right) + \\ &275509215 \times \left(z^{24} + z^{52}\right) + 1113906312 \times \left(z^{26} + z^{50}\right) + \\ &3626095793 \times \left(z^{28} + z^{48}\right) + 9404812736 \times \left(z^{30} + z^{46}\right) + \\ &19610283420 \times \left(z^{32} + z^{44}\right) + 33067534032 \times \left(z^{34} + z^{42}\right) + \\ &45200010670 \times \left(z^{36} + z^{40}\right) + 50157375456 \times z^{38}.\end{split}$$

Let *H* = PSL2(37), and we know that |*H* | = 2<sup>2</sup> × 32 × 19 × 37 = 25308. Consider the odd primes as factors *q*. For *q* = 3, ! 0 1 36 1 " generates the following permutation of order 3:

$$(\infty, 0, 1)(2, 36, 19)(3, 18, 13)(4, 12, 10)(5, 9, 23)(6, 22, 7)(8, 21, 24)$$
 
$$(11)(14, 17, 30)(15, 29, 33)(16, 32, 31)(20, 35, 25)(26, 34, 28)(27)$$

The corresponding invariant subcode has a generator matrix *G*(*S*3) of dimension 14, which is given by

*<sup>G</sup>*(*S*3) <sup>=</sup> ⎡ ⎢ ⎢ ⎢ ⎢ ⎣ <sup>1000000000000011101110011111101001001011</sup> <sup>001010001001110001000100000000000000</sup> <sup>0100000000000000001001100</sup> <sup>01011101100010001011100110</sup> <sup>1110000001000000000100000</sup> <sup>001000000000001110101110101</sup> <sup>01100110101101010010000111</sup> <sup>10100000001000000000000</sup> <sup>0001000000000011100111111111</sup> <sup>000100001111010100</sup> <sup>110101110010000010000000000000</sup> <sup>000010000000001110101111101010100101</sup> <sup>1111111110001111110000000000000000000000</sup> <sup>0000010000000011111111111111111111111111</sup> <sup>111111111110000000000000000000000000</sup> <sup>000000100000001110111000</sup> <sup>11111111110100101010111110</sup> <sup>11110000000000000000000000</sup> <sup>000000010000000000011001110</sup> <sup>10001000110111010001100</sup> <sup>11110000000000100100000000</sup> <sup>00000000100000111011011001011101110001</sup> <sup>00011101110101110000110000000000000000</sup> <sup>0000000001000011100101110111</sup> <sup>0001000111011101</sup> <sup>00110111110000000000001000000001</sup> <sup>0000000000100011100010001010011010010</sup> <sup>010111111001111110000000000010000000010</sup> <sup>0000000000010011100101100101011110000</sup> <sup>100011111111101110000000000000000001100</sup> <sup>00000000000010111010000100</sup> <sup>101011010110011010</sup> <sup>10111011110000000000000001010000</sup> <sup>000000000000010000111110111</sup> <sup>11011010101101111101111111</sup> <sup>11000000000000010000000</sup> ⎤ ⎥ ⎥ ⎥ ⎥ ⎦

and its weight enumerator function is

$$\begin{array}{c} A^{(\mathcal{S}\_{\mathbb{S}})}(z) = \left(1 + z^{\mathcal{7}6}\right) + 3 \times \left(z^{12} + z^{64}\right) + 24 \times \left(z^{16} + z^{60}\right) + \\ \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \begin{array}{c} 54 \times \left(z^{18} + z^{58}\right) + 150 \times \left(z^{20} + z^{56}\right) + 176 \times \left(z^{22} + z^{54}\right) + \\ 171 \times \left(z^{24} + z^{52}\right) + 468 \times \left(z^{26} + z^{50}\right) + 788 \times \left(z^{28} + z^{48}\right) + \\ 980 \times \left(z^{30} + z^{46}\right) + 1386 \times \left(z^{32} + z^{44}\right) + 1350 \times \left(z^{34} + z^{42}\right) + \\ 1573 \times \left(z^{36} + z^{40}\right) + 2136 \times z^{38} . \end{array} \right)$$

For *q* = 19, ! 0 1 36 3 " generates the following permutation of order 19:

$$\begin{aligned} (\infty, 0, 25, 5, 18, 32, 14, 10, 21, 2, 1, 19, 30, 26, 8, 22, 35, 15, 3) \\ (4, 36, 28, 34, 31, 33, 16, 17, 29, 27, 20, 13, 11, 23, 24, 7, 9, 6, 12) .\end{aligned}$$

The resulting generator matrix of the invariant subcode *G*(*S*19) , which has dimension 2, is

*<sup>G</sup>*(*S*19) <sup>=</sup> [ <sup>1011111111111111111111111111111111111110000</sup> <sup>000000000000000000000000000000000</sup> <sup>01000000000000000000000000000000000000011</sup> <sup>11111111111111111111111111111111111</sup> ]

and its weight enumerator function is

$$A^{(\mathcal{S}\_{\mathbb{I}^9})}(z) = 1 + 2z^{\mathcal{S}8} + z^{\mathcal{T}6}.\tag{9.33}$$

For the last odd prime, *q* = 37, a permutation of order 37

$$\begin{aligned} &(\infty, 0, 18, 24, 27, 14, 30, 15, 13, 32, 25, 26, 33, 19, 7, 4, 6, 23, 34, 4, 5, 6, 10, 12, 20, 20, 20, 21, 22, 20, 5, 21, 8, 11, 17, 35) (36) \end{aligned}$$

is generated by ! 0 1 36 35 " and it turns out that the corresponding invariant subcode, and hence, the weight enumerator function, are identical to those of *q* = 19.

For *q* = 2, subcodes fixed by some element of *G*<sup>0</sup> <sup>2</sup> and *G*<sup>4</sup> are required. We have *P* = ! 3 8 8 34 " and *T* = ! 0 36 1 0 " , and the resulting order 2 permutations generated by *P*, *T* and *PT* are

$$(\infty, 5)(0, 22)(1, 17)(2, 21)(3, 29)(4, 16)(6, 31)(7, 18)(8, 26)(9, 30)(10, 25)$$

$$(11, 34)(12, 14)(13, 36)(15)(19, 28)(20, 24)(23, 27)(32)(33, 35)$$

$$(\infty,0)(1,36)(2,18)(3,12)(4,9)(5,22)(6)(7,21)(8,23)(10,11)(13,17)$$

$$(14,29)(15,32)(16,30)(19,35)(20,24)(25,34)(26,27)(28,33)(31)$$

and

$$(\infty, 22)(0, 5)(1, 13)(2, 7)(3, 14)(4, 30)(6, 31)(8, 27)(9, 16)(10, 34)(11, 25)$$

$$(12, 29)(15, 32)(17, 36)(18, 21)(19, 33)(20)(23, 26)(24)(28, 35)$$

respectively. It follows that the corresponding generator matrices and weight enumerator functions of the invariant subcodes are

,

#### which has dimension 20, with

$$\begin{aligned} A^{(\mathcal{G}^1)}(z) &= \left(1 + z^{76}\right) + 21 \times \left(z^{12} + z^{64}\right) + 153 \times \left(z^{16} + z^{60}\right) + \\ &744 \times \left(z^{18} + z^{88}\right) + 1883 \times \left(z^{20} + z^{86}\right) + 4472 \times \left(z^{22} + z^{84}\right) + \\ &10119 \times \left(z^{24} + z^{52}\right) + 21000 \times \left(z^{26} + z^{80}\right) + 36885 \times \left(z^{28} + z^{48}\right) + \\ &58656 \times \left(z^{30} + z^{46}\right) + 85548 \times \left(z^{32} + z^{44}\right) + 108816 \times \left(z^{34} + z^{42}\right) + \\ &127534 \times \left(z^{36} + z^{40}\right) + 136912 \times z^{38} \end{aligned}$$

and

*<sup>G</sup>*(*G*4) <sup>=</sup> ⎡ ⎢ ⎢ ⎣ <sup>1000000000001101101011011101</sup> <sup>0001101110111110001110001</sup> <sup>00001000000000010000000</sup> <sup>010000000000000001010011100011</sup> <sup>000100011000011101011000</sup> <sup>0100000000000000000000</sup> <sup>00100000000011011010100001010001001100000</sup> <sup>01000000000000000100000000000010100</sup> <sup>00010000000011111111111111111111111111111111111</sup> <sup>11000000000000000000000000000</sup> <sup>000010000000000001010010000011</sup> <sup>001100010001011110100001</sup> <sup>0000000010100000000000</sup> <sup>0000010000001101101111000101</sup> <sup>10010111110110101100000000</sup> <sup>1000010000001000000000</sup> <sup>0000001000001100101101011101</sup> <sup>1000000110111010110100000</sup> <sup>00010000000000000000000</sup> <sup>00000001000000011000110001</sup> <sup>01100111101101111001101</sup> <sup>000000000000000010000000000</sup> <sup>000000001000000011000011110</sup> <sup>1110010000010011101111000000000000000000000100000</sup> <sup>0000000001001100001101</sup> <sup>011000100001011111100011010</sup> <sup>000000000001100000000001000</sup> <sup>0000000000100000100101000101</sup> <sup>100011001101111011101000</sup> <sup>000000000000000001000011</sup> <sup>0000000000011100111000111101010010010010</sup> <sup>011100111001000000000001000100000000</sup> ⎤ ⎥ ⎥ ⎦ ,

which has dimension 12, with

$$\begin{aligned} A^{(G\_4)}(z) &= \left(1 + z^{76}\right) + 3 \times \left(z^{12} + z^{64}\right) + 11 \times \left(z^{16} + z^{60}\right) + \\ &20 \times \left(z^{18} + z^{58}\right) + 51 \times \left(z^{20} + z^{56}\right) + 56 \times \left(z^{22} + z^{54}\right) + \\ &111 \times \left(z^{24} + z^{52}\right) + 164 \times \left(z^{26} + z^{90}\right) + 187 \times \left(z^{28} + z^{48}\right) + (9.35) \\ &224 \times \left(z^{30} + z^{46}\right) + 294 \times \left(z^{32} + z^{44}\right) + 328 \times \left(z^{34} + z^{42}\right) + \\ &366 \times \left(z^{36} + z^{40}\right) + 464 \times z^{38} \end{aligned}$$

respectively. Consider the number of codewords of weight 12, from (9.31)−(9.35), we know that *A*12(*G*<sup>0</sup> <sup>2</sup>) = 21 and *A*12(*G*4) = 3; applying (9.28),

$$A\_{12}(\mathbb{S}\_2) \equiv \mathfrak{Z} \times 2\mathbb{I} - 2 \times \mathfrak{Z} \pmod{4} \equiv \mathbb{I} \pmod{4}.$$

and thus, we have the following set of simultaneous congruences:

$$\begin{aligned} A\_{12}(S\_2) &\equiv 1 \pmod{2^2} \\ A\_{12}(S\_3) &\equiv 3 \pmod{3^2} \\ A\_{12}(S\_{19}) &\equiv 0 \pmod{19} \\ A\_{12}(S\_{37}) &\equiv 0 \pmod{37} .\end{aligned}$$

Following the Chinese remainder theorem, a solution to the above congruences, denoted by *A*<sub>12</sub>(*H*), is congruent modulo LCM{2<sup>2</sup>, 3<sup>2</sup>, 19, 37}, where LCM{2<sup>2</sup>, 3<sup>2</sup>, 19, 37} is the least common multiple of the moduli 2<sup>2</sup>, 3<sup>2</sup>, 19 and 37, which is equal to 2<sup>2</sup> × 3<sup>2</sup> × 19 × 37 = 25308 in this case. Since these moduli are pairwise coprime, by the extended Euclidean algorithm, we can write

$$\begin{aligned} 1 &= 4 \times 1582 + \frac{25308}{4} \times (-1) \\ 1 &= 9 \times 625 + \frac{25308}{9} \times (-2) \\ 1 &= 19 \times 631 + \frac{25308}{19} \times (-9) \\ 1 &= 37 \times 37 + \frac{25308}{37} \times (-2) .\end{aligned}$$

A solution to the congruences above is given by

$$\begin{aligned} A\_{12} &= 1 \times \left[ (-1) \frac{25308}{4} \right] + 3 \times \left[ (-2) \frac{25308}{9} \right] + 0 \times \left[ (-9) \frac{25308}{19} \right] \\ &\quad + 0 \times \left[ (-2) \frac{25308}{37} \right] \pmod{25308} \\ &= (-1) \times 6327 + (-6) \times 2812 \pmod{25308} \\ &\equiv 2109 \pmod{25308} \\ &= 25308\, n\_{12} + 2109. \end{aligned}$$
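The CRT combination above can be reproduced mechanically. The sketch below (a Python illustration of ours, not part of the book; `crt` is a hypothetical helper name) combines the four congruences using modular inverses in place of the explicit extended-Euclidean identities:

```python
from math import prod

def crt(residues, moduli):
    # Chinese remainder theorem for pairwise-coprime moduli; pow(Mi, -1, m)
    # computes the inverse of Mi modulo m via the extended Euclidean algorithm.
    M = prod(moduli)
    x = sum(r * (M // m) * pow(M // m, -1, m) for r, m in zip(residues, moduli))
    return x % M

# Congruences for A_12 of B_37: mod 2^2, 3^2, 19 and 37
print(crt([1, 3, 0, 0], [4, 9, 19, 37]))  # 2109
```

The three-argument `pow` with a negative exponent requires Python 3.8 or later.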

Referring to the weight enumerator function, (9.31), we can immediately see that *n*<sub>12</sub> = 0, indicating that *A*<sub>12</sub> has been accurately evaluated. Repeating the above procedure for weights larger than 12, we obtain Table 9.3, which shows that the weight distributions of *B*<sub>37</sub> are indeed accurate. In fact, since the complete weight distributions can be obtained once the first few terms required by Gleason's theorem are known, verification of these few terms is sufficient.

**Table 9.3** Modular congruence weight distributions of *B*<sub>37</sub>

*Example 9.2* Gulliver et al. [6] have shown that the (168, 84, 24) doubly even self-dual quadratic double-circulant code *B*<sub>83</sub> is not extremal, since it has minimum distance less than or equal to 28. The weight enumerator of a Type-II code of length *n* is given by Gleason's theorem, which is expressed as [15]

$$A(z) = \sum\_{i=0}^{\lfloor n/24 \rfloor} K\_i \left(1 + 14z^4 + z^8\right)^{\frac{n}{8} - 3i} \left\{z^4 (1 - z^4)^4\right\}^i,\tag{9.36}$$

where the *K*<sub>*i*</sub> are some integers. As shown by (9.36), only the first few terms of *A*<sub>*i*</sub> are required in order to completely determine the weight distribution of a Type-II code. For *B*<sub>83</sub>, only the first eight terms are required. Using the parallel version of the efficient codeword enumeration method described in Sect. 9.5, we determined that all of these eight terms are 0 apart from *A*<sub>0</sub> = 1, *A*<sub>24</sub> = 571704 and *A*<sub>28</sub> = 17008194.
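The mechanics of (9.36) can be checked on a small Type-II code. The sketch below (a pure-Python illustration of ours, not the authors' program; `gleason_type2` and the helper names are hypothetical) fits the *K*<sub>*i*</sub> to the ⌊*n*/24⌋ + 1 lowest known coefficients and expands the full enumerator. For the (24, 12, 8) extended Golay code, *A*<sub>0</sub> = 1 and *A*<sub>4</sub> = 0 suffice to recover the familiar counts *A*<sub>8</sub> = 759 and *A*<sub>12</sub> = 2576:

```python
def pmul(a, b):
    # Multiply two polynomials given as coefficient lists (index = degree).
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def ppow(a, e):
    out = [1]
    for _ in range(e):
        out = pmul(out, a)
    return out

def pad(a, n):
    return a + [0] * (n + 1 - len(a))

def gleason_type2(n, known):
    # Gleason's theorem for Type-II codes, (9.36): the enumerator is an
    # integer combination of g1^(n/8 - 3i) * g2^i.  Since g2^i starts at
    # weight 4i, the system for the K_i is triangular in the known
    # coefficients A_0, A_4, ..., A_{4*floor(n/24)}.
    g1 = [1, 0, 0, 0, 14, 0, 0, 0, 1]                      # 1 + 14z^4 + z^8
    g2 = pmul([0, 0, 0, 0, 1], ppow([1, 0, 0, 0, -1], 4))  # z^4 (1 - z^4)^4
    m = n // 24
    basis = [pad(pmul(ppow(g1, n // 8 - 3 * i), ppow(g2, i)), n)
             for i in range(m + 1)]
    K = []
    for j in range(m + 1):
        w = 4 * j      # basis[j] has coefficient 1 at weight 4j
        K.append(known[w] - sum(K[i] * basis[i][w] for i in range(j)))
    return [sum(K[i] * basis[i][w] for i in range(m + 1)) for w in range(n + 1)]

# (24, 12, 8) extended Golay code: A_0 = 1 and A_4 = 0 determine everything
A = gleason_type2(24, {0: 1, 4: 0})
print(A[8], A[12])  # 759 2576
```

The same routine applied with *n* = 168 and the eight terms above reproduces the full enumerator of *B*<sub>83</sub>.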

We need to verify independently whether or not *A*<sub>24</sub> and *A*<sub>28</sub> have been correctly evaluated. As in the previous example, the modular congruence method can be used for this purpose. For *p* = 83, we have |*H*| = 2<sup>2</sup> × 3 × 7 × 41 × 83 = 285852. We consider the odd prime cases first.

For prime *q* = 3, a cyclic group of order 3, *S*<sub>3</sub>, can be generated by $\left(\begin{smallmatrix}0 & 1 \\ 82 & 1\end{smallmatrix}\right) \in \text{PSL}_2(83)$, and we found that the subcode invariant under *S*<sub>3</sub> has dimension 28 and has 63 and 0 codewords of weights 24 and 28, respectively.

For prime *q* = 7, we have $\left(\begin{smallmatrix}0 & 1 \\ 82 & 10\end{smallmatrix}\right)$, which generates *S*<sub>7</sub>. The subcode fixed by *S*<sub>7</sub> has dimension 12 and no codewords of weight 24 or 28 are contained in this subcode.

Similarly, for prime *q* = 41, the subcode fixed by *S*<sub>41</sub>, which is generated by $\left(\begin{smallmatrix}0 & 1 \\ 82 & 4\end{smallmatrix}\right)$ and has dimension 4, contains no codewords of weight 24 or 28.

Finally, for prime *q* = 83, the invariant subcode of dimension 2 contains only four codewords: the all-zeros codeword, the all-ones codeword, the codeword consisting of 84 zeros followed by 84 ones, and the codeword consisting of 84 ones followed by 84 zeros. The cyclic group *S*<sub>83</sub> is generated by $\left(\begin{smallmatrix}0 & 1 \\ 82 & 81\end{smallmatrix}\right)$.

For the case of *q* = 2, we have *P* = $\left(\begin{smallmatrix}1 & 9 \\ 9 & 82\end{smallmatrix}\right)$ and *T* = $\left(\begin{smallmatrix}0 & 82 \\ 1 & 0\end{smallmatrix}\right)$. The subcode fixed by *S*<sub>2</sub>, which has dimension 42, contains 196 and 1050 codewords of weights 24 and 28, respectively. Meanwhile, the subcode fixed by *G*<sub>4</sub>, which has dimension 22, contains 4 and 6 codewords of weights 24 and 28, respectively.

Thus, using (9.28), the numbers of codewords of weights 24 and 28 fixed by *S*<sub>2</sub> are

$$\begin{aligned} A\_{24}(S\_2) &= 3 \times 196 - 2 \times 4 \equiv 0 \pmod{4}, \text{ and} \\ A\_{28}(S\_2) &= 3 \times 1050 - 2 \times 6 \equiv 2 \pmod{4} \end{aligned}$$

and by applying the Chinese remainder theorem to all *A*<sub>*i*</sub>(*S*<sub>*q*</sub>) for *i* = 24, 28, we arrive at

$$A\_{24} = n\_{24} \times 285852\tag{9.37a}$$

and

$$A\_{28} = n\_{28} \times 285852 + 142926 \,. \tag{9.37b}$$

From (9.37) we have now verified *A*<sub>24</sub> and *A*<sub>28</sub>, since equality holds for the non-negative integers *n*<sub>24</sub> = 2 and *n*<sub>28</sub> = 59. Using Gleason's theorem, i.e. (9.36), the weight enumerator function of the (168, 84, 24) code *B*<sub>83</sub> is obtained and is given by

$$\begin{aligned} A(z) &= (z^0 + z^{168}) + \\ &571704 \times (z^{24} + z^{144}) + \\ &17008194 \times (z^{28} + z^{140}) + \\ &5805701484 \times (z^{32} + z^{136}) + \\ &125261575636 \times (z^{36} + z^{132}) + \\ &160668282611929 \times (z^{40} + z^{128}) + \\ &13047194638256310 \times (z^{44} + z^{124}) + \\ &629048483051034984 \times (z^{48} + z^{120}) + \\ &1908712980855686056 \times (z^{52} + z^{116}) + \\ &37209697089301086600 \times (z^{56} + z^{112}) + \\ &473921490433882602066 \times (z^{60} + z^{108}) + \\ &39973673426117366814414 \times (z^{64} + z^{104}) + \\ &2256966751178950020072 \times (z^{68} + z^{100}) + \\ &86021109321000217491044 \times (z^{72} + z^{96}) + \\ &227230689293966645002066 \times (z^{76} + z^{92}) + \\ &393509958727966855449910376 \times (z^{80} + z^{88}) + \\ &47587474111704680343205104 \times z^{84} \end{aligned} \tag{9.38}$$
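The two congruences in (9.37) are easy to re-check numerically. The fragment below (Python, ours rather than the book's) confirms the residues and recovers *n*<sub>24</sub> and *n*<sub>28</sub>:

```python
# Check (9.37): A_24 and A_28 of B_83 modulo |H| = |PSL_2(83)| = 285852
H = 2**2 * 3 * 7 * 41 * 83
A24, A28 = 571704, 17008194
assert A24 % H == 0          # (9.37a)
assert A28 % H == 142926     # (9.37b)
print(A24 // H, (A28 - 142926) // H)  # 2 59
```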

For the complete weight distributions and their congruences of the (2(*p* + 1), *p* + 1, *d*) quadratic double-circulant codes, for 11 ≤ *p* ≤ 83, except *p* = 37, which has already been given in Example 9.1, refer to Appendix "Weight Distributions of Quadratic Double-Circulant Codes and their Modulo Congruence".

# *9.6.2 The Number of Codewords of a Given Weight in Extended Quadratic Residue Codes*

We have modified the modular congruence approach of Mykkeltveit et al. [14], which was originally introduced for extended QR codes *L̂*<sub>*p*</sub>, so that it is applicable to the quadratic double-circulant codes. Whilst *B*<sub>*p*</sub> contains one non-cyclic subgroup of order 4, *L̂*<sub>*p*</sub> contains two distinct non-cyclic subgroups of this order, namely *G*<sub>4</sub><sup>0</sup> and *G*<sub>4</sub><sup>1</sup>. As a consequence, (9.28) becomes

$$A\_i(S\_2) \equiv (2^m + 1)A\_i(H\_2) - 2^{m-1}A\_i(G\_4^0) - 2^{m-1}A\_i(G\_4^1) \pmod{2^{m+1}},\tag{9.39}$$

where 2<sup>*m*+1</sup> is the highest power of 2 that divides |*H*|. Unlike *B*<sub>*p*</sub>, where there are two circulants each of which is fixed by PSL<sub>2</sub>(*p*), the linear group PSL<sub>2</sub>(*p*) acts on the entire set of coordinates of *L̂*<sub>*p*</sub>. In order to obtain the invariant subcode, we only need a set of linear equations containing the parity-check matrix of *L̂*<sub>*p*</sub>, which is arranged in (0, 1, …, *p* − 2, *p* − 1)(∞) order, and *c*<sub>*i*</sub> + *c*<sub>*i*′</sub> = 0 for all *i* ∈ 𝔽<sub>*p*</sub> ∪ {∞}. Note that *c*<sub>*i*</sub> and *c*<sub>*i*′</sub> are defined in the same manner as in Sect. 9.6.1.
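The invariant subcode computation just described is ordinary linear algebra over GF(2): stack the parity-check rows with the fixing equations *c*<sub>*i*</sub> + *c*<sub>*i*′</sub> = 0 and take the null space. A generic sketch follows (Python; the function names and the toy parity-check matrix are illustrative assumptions, not the authors' implementation):

```python
def gf2_nullspace(M, cols):
    # Gauss-Jordan elimination over GF(2) on a list of 0/1 rows;
    # returns a basis of the null space.
    M = [row[:] for row in M]
    pivots, r = [], 0
    for c in range(cols):
        pivot = next((i for i in range(r, len(M)) if M[i][c]), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c]:
                M[i] = [a ^ b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    basis = []
    for f in (c for c in range(cols) if c not in pivots):
        v = [0] * cols
        v[f] = 1
        for i, c in enumerate(pivots):
            v[c] = M[i][f]
        basis.append(v)
    return basis

def invariant_subcode(H, perm):
    # Append the fixing equations c_i + c_perm(i) = 0 to the parity checks,
    # then solve the combined system over GF(2).
    n = len(perm)
    fix = []
    for i, j in enumerate(perm):
        row = [0] * n
        row[i] ^= 1
        row[j] ^= 1
        fix.append(row)
    return gf2_nullspace(H + fix, n)

# Toy check: the length-4 even-weight code with coordinates 0 and 1 swapped;
# the fixed subcode (c0 = c1, even overall weight) has dimension 2.
basis = invariant_subcode([[1, 1, 1, 1]], [1, 0, 2, 3])
print(len(basis))  # 2
```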

We demonstrate the importance of this modular congruence approach by proving that the published results for the weight distributions of *L*ˆ <sup>151</sup> and *L*ˆ <sup>137</sup> are incorrect. However, first let us derive the weight distribution of *L*ˆ 167.

*Example 9.3* There exists an extended QR code *L̂*<sub>167</sub> which has parameters (*n* = 168, *k* = 84 and *d* = 24) identical to those of the code *B*<sub>83</sub>. Since *L̂*<sub>167</sub> can be put into double-circulant form and is Type-II self-dual, the algorithm in Sect. 9.5 can be used to compute the number of codewords of weights 24 and 28, denoted by *A*′<sub>24</sub> and *A*′<sub>28</sub> for convenience, from which we can use Gleason's theorem (9.36) to derive the weight enumerator function of the code, *A*′(*z*). By codeword enumeration using multiple computers we found that

$$\begin{aligned} A'\_{24} &= 776216\\ A'\_{28} &= 18130188. \end{aligned} \tag{9.40}$$

In order to verify the accuracy of *A*′<sub>24</sub> and *A*′<sub>28</sub>, the modular congruence method is used. In this case, we have Aut(*L̂*<sub>167</sub>) ⊇ *H* = PSL<sub>2</sub>(167). We also know that |PSL<sub>2</sub>(167)| = 2<sup>3</sup> × 3 × 7 × 83 × 167 = 2328648. Let *P* = $\left(\begin{smallmatrix}12 & 32 \\ 32 & 155\end{smallmatrix}\right)$ and *T* = $\left(\begin{smallmatrix}0 & 166 \\ 1 & 0\end{smallmatrix}\right)$.

Let the permutations of orders 3, 7, 83 and 167 be generated by $\left(\begin{smallmatrix}0 & 1 \\ 166 & 1\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 166 & 19\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 166 & 4\end{smallmatrix}\right)$ and $\left(\begin{smallmatrix}0 & 1 \\ 166 & 165\end{smallmatrix}\right)$, respectively. The numbers of codewords of weights 24 and 28 in the various invariant subcodes of dimension *k* are


For *L*ˆ 167, equation (9.39) becomes

$$A\_i(S\_2) \equiv 5 \times A\_i(H\_2) - 2 \times A\_i(G\_4^0) - 2 \times A\_i(G\_4^1) \pmod{8} \,. \tag{9.41}$$

It follows that

$$\begin{aligned} A\_{24}(S\_2) &\equiv 0 \pmod{8}, \\ A\_{28}(S\_2) &\equiv 4 \pmod{8} \end{aligned}$$

and thus,

$$A\_{24}^{'} = n\_{24}^{'} \times 2328648 + 776216\tag{9.42a}$$

and

$$A'\_{28} = n'\_{28} \times 2328648 + 1829652\tag{9.42b}$$

from the Chinese remainder theorem.

From (9.37a) and (9.42a), we can see that *B*<sub>83</sub> and *L̂*<sub>167</sub> are indeed inequivalent: for all integers *n*<sub>24</sub>, *n*′<sub>24</sub> ≥ 0, *A*<sub>24</sub> ≠ *A*′<sub>24</sub>.

Comparing Eq. (9.40) with (9.42a) and (9.42b) establishes that *A*′<sub>24</sub> = 776216 (*n*′<sub>24</sub> = 0) and *A*′<sub>28</sub> = 18130188 (*n*′<sub>28</sub> = 7). The weight enumerator of *L̂*<sub>167</sub> is derived from (9.36) and is given in (9.43). In comparison to (9.38), it may be seen that *L̂*<sub>167</sub> is a slightly inferior code to *B*<sub>83</sub>, having more codewords of weights 24, 28 and 32.

$$\begin{aligned} A'(z) &= (z^0 + z^{168}) + \\ &776216 \times (z^{24} + z^{144}) + \\ &18130188 \times (z^{28} + z^{140}) + \\ &5550332508 \times (z^{32} + z^{136}) + \\ &1251282702264 \times (z^{36} + z^{132}) + \\ &166071600559137 \times (z^{40} + z^{128}) + \\ &13047136918828740 \times (z^{44} + z^{124}) + \\ &629048543890724216 \times (z^{48} + z^{120}) + \\ &19087130695796615088 \times (z^{52} + z^{116}) + \\ &372099690249351071112 \times (z^{56} + z^{112}) + \\ &4739291519495550245228 \times (z^{60} + z^{108}) + \\ &399736733375990380474086 \times (z^{64} + z^{104}) + \\ &225696677727188690570184 \times (z^{68} + z^{100}) + \\ &860241108921860741947676 \times (z^{72} + z^{96}) + \\ &2227390683565491780127428 \times (z^{76} + z^{92}) + \\ &3935099586463594172460648 \times (z^{80} + z^{88}) + \\ &4755747412595715344169376 \times z^{84} \end{aligned} \tag{9.43}$$

*Example 9.4* Gaborit et al. [4] gave *A*<sub>2*i*</sub>, for 22 ≤ 2*i* ≤ 32, of *L̂*<sub>137</sub> and we will check the consistency of the published results. For *p* = 137, we have |PSL<sub>2</sub>(137)| = 2<sup>3</sup> × 3 × 17 × 23 × 137 = 1285608 and we need to compute *A*<sub>2*i*</sub>(*S*<sub>*q*</sub>), where 22 ≤ 2*i* ≤ 32, for all primes *q* dividing |PSL<sub>2</sub>(137)|. Let *P* = $\left(\begin{smallmatrix}137 & 51 \\ 51 & 1\end{smallmatrix}\right)$ and *T* = $\left(\begin{smallmatrix}0 & 136 \\ 1 & 0\end{smallmatrix}\right)$.

Let $\left(\begin{smallmatrix}0 & 1 \\ 136 & 1\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 136 & 6\end{smallmatrix}\right)$ and $\left(\begin{smallmatrix}0 & 1 \\ 136 & 11\end{smallmatrix}\right)$ be generators of permutations of orders 3, 17 and 23, respectively. It is not necessary to find a generator of a permutation of order 137, as it fixes the all-zeros and all-ones codewords only. Subcodes that are invariant under *G*<sub>2</sub><sup>0</sup>, *G*<sub>4</sub><sup>0</sup>, *G*<sub>4</sub><sup>1</sup>, *S*<sub>3</sub>, *S*<sub>17</sub> and *S*<sub>23</sub> are obtained, and the number of codewords of weight 2*i*, for 22 ≤ 2*i* ≤ 32, in these subcodes is then computed. The results are shown as follows, where *k* denotes the dimension of the corresponding subcode,


We have

$$A\_i(S\_2) \equiv 5 \times A\_i(H\_2) - 2 \times A\_i(G\_4^0) - 2 \times A\_i(G\_4^1) \pmod{8} \,,$$

for *L*ˆ 137, which is identical to that for *L*ˆ <sup>167</sup> since they both have 2<sup>3</sup> as the highest power of 2 that divides |*H* |. Using this formulation, we obtain

$$\begin{aligned} A\_{22}(S\_2) &\equiv 2 \pmod{8} \\ A\_{24}(S\_2) &\equiv 4 \pmod{8} \\ A\_{26}(S\_2) &\equiv 6 \pmod{8} \\ A\_{28}(S\_2) &\equiv 2 \pmod{8} \\ A\_{30}(S\_2) &\equiv 0 \pmod{8} \\ A\_{32}(S\_2) &\equiv 3 \pmod{8} \end{aligned}$$

and combining all the results using the Chinese remainder theorem, we arrive at

$$\begin{aligned} A\_{22} &= n\_{22} \times 1285608 + 321402 \\ A\_{24} &= n\_{24} \times 1285608 + 1071340 \\ A\_{26} &= n\_{26} \times 1285608 + 964206 \\ A\_{28} &= n\_{28} \times 1285608 + 321402 \\ A\_{30} &= n\_{30} \times 1285608 + 428536 \\ A\_{32} &= n\_{32} \times 1285608 + 1124907 \end{aligned} \tag{9.44}$$

for some non-negative integers *ni* . Comparing these to the results in [4], we can immediately see that *n*<sup>22</sup> = 0, *n*<sup>24</sup> = 1, *n*<sup>26</sup> = 16, *n*<sup>28</sup> = 381, and both *A*<sup>30</sup> and *A*<sup>32</sup> were incorrectly reported. By codeword enumeration using multiple computers in parallel, we have determined that

$$\begin{aligned} A\_{30} &= 6648307504 \\ A\_{32} &= 77865259035 \end{aligned}$$

hence, referring to (9.44) it is found that *n*<sup>30</sup> = 5171 and *n*<sup>32</sup> = 60566.
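These corrected values can be checked against (9.44) directly; the fragment below is our own Python check, not part of the original text:

```python
# Corrected A_30 and A_32 of the extended QR code for p = 137 must match
# their CRT residues modulo |PSL_2(137)| = 1285608, see (9.44).
H = 2**3 * 3 * 17 * 23 * 137
A30, A32 = 6648307504, 77865259035
assert A30 % H == 428536 and A32 % H == 1124907
print((A30 - 428536) // H, (A32 - 1124907) // H)  # 5171 60566
```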

*Example 9.5* Gaborit et al. [4] also published the weight distribution of *L̂*<sub>151</sub> and we will show that this has also been incorrectly reported. For *L̂*<sub>151</sub>, |PSL<sub>2</sub>(151)| = 2<sup>3</sup> × 3 × 5<sup>2</sup> × 19 × 151 = 1721400 and we have *P* = $\left(\begin{smallmatrix}104 & 31 \\ 31 & 47\end{smallmatrix}\right)$ and *T* = $\left(\begin{smallmatrix}0 & 150 \\ 1 & 0\end{smallmatrix}\right)$.

Let $\left(\begin{smallmatrix}0 & 1 \\ 150 & 1\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 150 & 27\end{smallmatrix}\right)$ and $\left(\begin{smallmatrix}0 & 1 \\ 150 & 8\end{smallmatrix}\right)$ be generators of permutations of orders 3, 5 and 19, respectively. The numbers of codewords of weight *i*, for *i* = 20 and 24, in the various fixed subcodes of dimension *k* are


and *A*<sub>*i*</sub>(*S*<sub>2</sub>) is again given by the same congruence as for primes 167 and 137, see (9.41). Using this equation, we have *A*<sub>20</sub>(*S*<sub>2</sub>) ≡ *A*<sub>24</sub>(*S*<sub>2</sub>) ≡ 2 (mod 8). Following the Chinese remainder theorem, we obtain

$$\begin{aligned} A\_{20} &= n\_{20} \times 1721400 + 28690 \\ A\_{24} &= n\_{24} \times 1721400 + 717250 \end{aligned} \tag{9.45}$$

It follows that *A*<sup>20</sup> is correctly reported in [4], but *A*<sup>24</sup> is incorrectly reported as 717230. Using the method in Sect. 9.5 implemented on multiple computers, we have determined that

$$\begin{aligned} A\_{20} &= 28690 \\ A\_{24} &= 717250 \end{aligned}$$

hence *n*<sup>20</sup> = 0 and *n*<sup>24</sup> = 0 in (9.45). Since *A*<sup>20</sup> and *A*<sup>24</sup> are required to derive the complete weight distribution of *L*ˆ <sup>151</sup> according to Gleason's theorem for Type-II codes (9.36), the weight distribution of *L*ˆ <sup>151</sup> given in [4] is not correct. The correct weight distribution of this code, given in terms of the weight enumerator function, is

$$\begin{aligned} A(z) &= (z^0 + z^{152}) + \\ &28690 \times (z^{20} + z^{132}) + \\ &717250 \times (z^{24} + z^{128}) + \\ &164250250 \times (z^{28} + z^{124}) + \\ &39390351505 \times (z^{32} + z^{120}) + \\ &5498418962110 \times (z^{36} + z^{116}) + \\ &439930711621830 \times (z^{40} + z^{112}) + \\ &19714914846904500 \times (z^{44} + z^{108}) + \\ &542987434093298550 \times (z^{48} + z^{104}) + \\ &92223680169626658 \times (z^{52} + z^{100}) + \\ &99848872933173169615 \times (z^{56} + z^{96}) + \\ &670740325250798111830 \times (z^{60} + z^{92}) + \\ &2949674479653615754525 \times (z^{64} + z^{88}) + \\ &844602592483506824150 \times (z^{68} + z^{84}) + \\ &18840564760239283282420 \times (z^{72} + z^{80}) + \\ &19527364659006697265368 \times z^{76} \end{aligned} \tag{9.46}$$

# **9.7 Minimum Distance Evaluation: A Probabilistic Approach**

An interesting observation is that the minimum weight codewords of *L̂*<sub>*p*</sub>, for *p* ≡ ±1 (mod 8), and of *B*<sub>*p*</sub>, for *p* ≡ ±3 (mod 8), are always contained in one or more of their fixed subcodes. At least, this is true for all known cases (*n* ≤ 200), as depicted in Table 9.4. We can see that the subcode fixed by *H*<sub>2</sub> appears in all the known cases. In Table 9.4, the column *d*<sub>*U*</sub> denotes the minimum distance upper bound of extremal doubly even self-dual codes of a given length, and the last column indicates the various subgroups whose fixed subcodes contain the minimum weight codewords. The highest *n* for which the minimum distance of extended QR codes is known is 168 [5], and we provide further results for *n* = 192, 194 and 200. We obtained the minimum distance of these extended QR codes using the parallel version of the minimum distance algorithm for cyclic codes (QR codes are cyclic) described in Chap. 5, Sect. 5.4. Note that the fact that the code is singly even (*n* = 194) or doubly even (*n* = 192, 200) is also taken into account in order to reduce the number of codewords that need to be enumerated, see Chap. 5, Sects. 5.2.3 and 5.4. This code property is also taken into account when computing the minimum distance of *B*<sub>*p*</sub> using the method described in Sect. 9.5.

**Table 9.4** The minimum distance of *L̂*<sub>*p*</sub> and *B*<sub>*p*</sub> for 12 ≤ *n* ≤ 200

<sup>a</sup>Extended duadic code [12] has higher minimum distance

Based on the above observation, a probabilistic approach to minimum distance evaluation is developed. Given *L*ˆ *<sup>p</sup>* or *Bp*, the minimum distance of the code is upper bounded by

$$d \le \min\_{Z = \{G\_2^0, G\_4^0, G\_4^1, S\_{q\_1}, S\_{q\_2}, \ldots\}} \{d(Z)\}\,,\tag{9.47}$$


**Table 9.5** The minimum distance of *L*ˆ *<sup>p</sup>* and *B<sup>p</sup>* for 204 ≤ *n* ≤ 450

<sup>a</sup>Extended duadic code [12] has higher minimum distance
<sup>b</sup>The minimum distance of the subcode is computed probabilistically

where *d*(*Z*) is the minimum distance of the subcode fixed by *Z* ⊆ PSL<sub>2</sub>(*p*), and *q*<sub>1</sub>, *q*<sub>2</sub>, … run through all odd primes that divide |PSL<sub>2</sub>(*p*)|. Note that for *B*<sub>*p*</sub>, *G*<sub>4</sub><sup>0</sup> = *G*<sub>4</sub><sup>1</sup> and hence only one of them is required. Using (9.47), we give an upper bound on the minimum distance of *L̂*<sub>*p*</sub> and *B*<sub>*p*</sub> for all codes with *n* ≤ 450, tabulated in Table 9.5. The various fixed subgroups in which the minimum weight codewords are found are given in the last column of this table.

**Fig. 9.1** Minimum distance and the extremal bound for doubly even self-dual codes

As shown in Tables 9.4 and 9.5, there are no extremal extended QR or quadratic double-circulant codes for 136 < *n* ≤ 450, and we plot the minimum distance (or its upper bound for *n* > 200) against the extremal bound in Fig. 9.1. From this figure, it is obvious that, as the block length increases, the gap between the extremal bound and the minimum distance widens, and it seems likely that longer block lengths will follow the same trend. Thus, we conjecture that *n* = 136 is the longest block length for which a doubly even extremal self-dual double-circulant code exists. It is worth noting that, for extended QR codes, the results obtained using this probabilistic method are the same as those published by Leon [11].
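The bound (9.47) only ever requires the minimum distance of low-dimensional fixed subcodes, which can be found by brute force. A sketch follows (Python; the function names are illustrative, and each subcode is assumed to be given by a basis over GF(2)):

```python
from itertools import product

def min_weight(basis):
    # Exhaustive minimum nonzero weight of the code spanned by `basis`;
    # feasible because the fixed subcodes have small dimension.
    best = None
    for coeffs in product([0, 1], repeat=len(basis)):
        if not any(coeffs):
            continue
        v = [0] * len(basis[0])
        for c, b in zip(coeffs, basis):
            if c:
                v = [x ^ y for x, y in zip(v, b)]
        w = sum(v)
        best = w if best is None else min(best, w)
    return best

def distance_bound(subcode_bases):
    # d <= min over the fixed subcodes, as in (9.47)
    return min(min_weight(b) for b in subcode_bases)

# Toy example: the [4,3] even-weight code has minimum weight 2
print(min_weight([[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1]]))  # 2
```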

#### **9.8 Conclusions**

Bordered double-circulant codes based on primes can be classified into two classes: (*p* + 1, (*p* + 1)/2, *d*) extended QR codes, for primes ±1 (mod 8), and (2(*p* + 1), *p* + 1, *d*) quadratic double-circulant codes, for primes ±3 (mod 8).

Whilst quadratic double-circulant codes always exist, given a prime *p* ≡ ±3 (mod 8), bordered double-circulant codes may not exist given a prime *p* ≡ ±1 (mod 8).

There always exist (2*p*, *p*, *d*) pure double-circulant codes for any prime *p* ≡ ±3 (mod 8).

For primes *p* ≡ −1, 3 (mod 8), the double-circulant codes are self-dual and for other primes, the double-circulant codes are formally self-dual.

By exploiting the code structure of formally self-dual, double-circulant codes for *p* ≡ −3 (mod 8) and also the self-dual double-circulant codes for both pure and bordered cases, we have shown that, compared to the standard method of evaluation, the number of codewords required to evaluate the minimum distance or to count the number of codewords of a given weight can be reduced by a factor of 2.

The automorphism group of the (*p* + 1, (*p* + 1)/2, *d*) extended QR code contains the projective special linear group PSL2(*p*) acting on the coordinates (∞)(0, 1,..., *p* − 2, *p* − 1).

The automorphism group of the (2(*p* + 1), *p* + 1, *d*) quadratic double-circulant code contains PSL2(*p*), acting on coordinates (∞)(0, 1,..., *p* − 2, *p* − 1), applied simultaneously to left and right circulants.

The number of codewords of weight *i* of prime-based double-circulant codes, denoted by *Ai* , can be written as *Ai* = *ni* × |PSL2(*p*)| + *Ai*(PSL2(*p*)) ≡ *Ai*(PSL2(*p*)) (mod |PSL2(*p*)|) where *Ai*(PSL2(*p*)) denotes the number of codewords of weight *i* that are fixed by some element of PSL2(*p*). This result was due to Mykkeltveit et al. [14] and was originally introduced for extended QR codes. We have shown in this chapter that, with some modifications, this modulo congruence method can also be applied to quadratic double-circulant codes.

The modulo congruence technique is found to be very useful in verifying the number of codewords of a given weight obtained exhaustively by computation. We have shown the usefulness of this method by providing corrections to mistakes in previously published results of the weight distributions of extended QR codes for primes 137 and 151.

The weight distribution of the (168, 84, 24) extended QR code, which was previously unknown, has been evaluated and presented above. There also exists a quadratic double-circulant code with identical parameters (*n*, *k* and *d*) and the weight distribution of this code has also been presented above. The (168, 84, 24) quadratic double-circulant code is a better code than the (168, 84, 24) extended QR code since it has fewer low-weight codewords. The usefulness of the modulo congruence method in checking weight distribution results has been demonstrated in verifying the correctness of the weight distributions of these two codes.

The weight enumerator polynomial of an extended QR code for a prime *p*, denoted by *A*<sub>*L̂*</sub>(*z*), can be obtained using Gleason's theorem once the first few terms are known. Since PSL<sub>2</sub>(*p*) is doubly transitive [13], knowing *A*<sub>*L̂*</sub>(*z*) implies that *A*<sub>*L*</sub>(*z*), the weight enumerator polynomial of the corresponding cyclic QR code, is also known, i.e.

$$A\_{L}(z) = A\_{\hat{L}}(z) + \frac{1-z}{p+1} A'\_{\hat{L}}(z)$$

where *A*′<sub>*L̂*</sub>(*z*) is the first derivative of *A*<sub>*L̂*</sub>(*z*) with respect to *z* [19]. As a consequence, we have been able to evaluate the weight distributions of the QR codes for primes 151 and 167. These are tabulated in Appendix "Weight Distributions of Quadratic Residues Codes for Primes 151 and 167", Tables 9.19 and 9.20, respectively.
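The derivative relation above is easy to illustrate for *p* = 7, where the extended QR code is the (8, 4, 4) extended Hamming code with *A*<sub>*L̂*</sub>(*z*) = 1 + 14*z*<sup>4</sup> + *z*<sup>8</sup>, and the cyclic QR code is the (7, 4, 3) Hamming code. A sketch on plain coefficient lists (Python, ours, not from the book; `puncture_enumerator` is a hypothetical helper name):

```python
def puncture_enumerator(A_ext, p):
    # A_L(z) = A_Lhat(z) + (1 - z)/(p + 1) * A_Lhat'(z), applied to
    # coefficient lists indexed by weight (length p + 2 for the extended code).
    # D[w] is the coefficient of z^w in the derivative A_Lhat'(z).
    D = [(w + 1) * A_ext[w + 1] for w in range(len(A_ext) - 1)] + [0]
    out = []
    for w, a in enumerate(A_ext):
        num = D[w] - (D[w - 1] if w > 0 else 0)   # coeff of z^w in (1-z)A'
        assert num % (p + 1) == 0                 # codeword counts are integers
        out.append(a + num // (p + 1))
    return out

# p = 7: extended Hamming (8,4,4) -> cyclic Hamming (7,4,3)
A_ext = [1, 0, 0, 0, 14, 0, 0, 0, 1]
print(puncture_enumerator(A_ext, 7))  # [1, 0, 0, 7, 7, 0, 0, 1, 0]
```

The result 1 + 7*z*<sup>3</sup> + 7*z*<sup>4</sup> + *z*<sup>7</sup> is the familiar Hamming (7, 4) weight enumerator.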

A new probabilistic method to obtain the minimum distance of double-circulant codes based on primes has been described. This probabilistic approach is based on the observation that the minimum weight codewords are always contained in one or more subcodes fixed by some element of PSL2(*p*). Using this approach, we conjecture that there are no extremal double-circulant self-dual codes longer than 136 and that this is the last extremal code to be found.

## **9.9 Summary**

In this chapter, self-dual and binary double-circulant codes based on primes have been described in detail. These binary codes are some of the most powerful codes known and form an important class due to their error-correcting capabilities and their rich mathematical structure. This structure enables the entire weight distribution of a code to be determined. With these properties, this family of codes has been the subject of extensive research for many years. For codes longer than around 150 bits, an accurate determination of the codeword weight distributions has been an unsolved challenge. We have shown that the code structure may be used in a new algorithm that requires fewer codewords to be enumerated than traditional methods. As a consequence, we have presented new weight distribution results for codes of length 152, 168, 192, 194 and 200. We have shown how a modular congruence method can be used to check weight distributions and have corrected some mistakes in previously published results for codes of lengths 137 and 151. For evaluation of the minimum Hamming distance of very long codes, a new probabilistic method has been presented, along with results for codes up to 450 bits long. It is conjectured that the (136, 68, 24) self-dual code is the longest extremal code, meeting the upper bound for minimum Hamming distance, and that no other, longer, extremal code exists.

## **Appendix**

# **Circulant Analysis** *p* **= 11**

See Tables 9.6, 9.7 and 9.8.


**Table 9.6** Circulant analysis *p* = 11, *a*(*x*) = 1 + *x* + *x*<sup>3</sup>, non-factors of 1 + *x*<sup>*p*</sup>




**Table 9.7**

**Table 9.8** Circulant analysis *p* = 11, *j*(*x*) = 1 + *x* + *x*<sup>2</sup> + *x*<sup>3</sup> + *x*<sup>4</sup> + *x*<sup>5</sup> + *x*<sup>6</sup> + *x*<sup>7</sup> + *x*<sup>8</sup> + *x*<sup>9</sup> + *x*<sup>10</sup>, factors of 1 + *x*<sup>*p*</sup>


# **Weight Distributions of Quadratic Double-Circulant Codes and their Modulo Congruence**

# *Primes* **+***3 Modulo 8*

#### **Prime 11**

We have *P* = $\left(\begin{smallmatrix}1 & 3 \\ 3 & 10\end{smallmatrix}\right)$ and *T* = $\left(\begin{smallmatrix}0 & 10 \\ 1 & 0\end{smallmatrix}\right)$, *P*, *T* ∈ PSL<sub>2</sub>(11), and the permutations of orders 3, 5 and 11 are generated by $\left(\begin{smallmatrix}0 & 1 \\ 10 & 1\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 10 & 3\end{smallmatrix}\right)$ and $\left(\begin{smallmatrix}0 & 1 \\ 10 & 9\end{smallmatrix}\right)$, respectively. In addition,

$$|\text{PSL}\_2(11)| = 2^2 \cdot 3 \cdot 5 \cdot 11 = 660$$

and the weight enumerator polynomials of the invariant subcodes are

$$\begin{split} A^{(G\_2^0)}\_{B\_{11}}(z) &= \left(1 + z^{24}\right) + 15 \cdot \left(z^8 + z^{16}\right) + 32 \cdot z^{12} \\ A^{(G\_4)}\_{B\_{11}}(z) &= \left(1 + z^{24}\right) + 3 \cdot \left(z^8 + z^{16}\right) + 8 \cdot z^{12} \\ A^{(S\_3)}\_{B\_{11}}(z) &= \left(1 + z^{24}\right) + 14 \cdot z^{12} \\ A^{(S\_5)}\_{B\_{11}}(z) &= \left(1 + z^{24}\right) + 4 \cdot \left(z^8 + z^{16}\right) + 6 \cdot z^{12} \\ A^{(S\_{11})}\_{B\_{11}}(z) &= \left(1 + z^{24}\right) + 2 \cdot z^{12} \,. \end{split}$$

The weight distributions of *B*<sup>11</sup> and their modular congruence are shown in Table 9.9.


**Table 9.9** Modular congruence weight distributions of *B*<sup>11</sup>

#### **Prime 19**

We have *P* = $\left(\begin{smallmatrix}1 & 6 \\ 6 & 18\end{smallmatrix}\right)$ and *T* = $\left(\begin{smallmatrix}0 & 18 \\ 1 & 0\end{smallmatrix}\right)$, *P*, *T* ∈ PSL<sub>2</sub>(19), and the permutations of orders 3, 5 and 19 are generated by $\left(\begin{smallmatrix}0 & 1 \\ 18 & 1\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 18 & 4\end{smallmatrix}\right)$ and $\left(\begin{smallmatrix}0 & 1 \\ 18 & 17\end{smallmatrix}\right)$, respectively. In addition,

$$|\text{PSL}\_2(19)| = 2^2 \cdot 3^2 \cdot 5 \cdot 19 = 3420$$

and the weight enumerator polynomials of the invariant subcodes are

$$\begin{split} A^{(G\_2^0)}\_{B\_{19}}(z) &= \left(1+z^{40}\right) + 5\cdot\left(z^8+z^{32}\right) + 80\cdot\left(z^{12}+z^{28}\right) + 250\cdot\left(z^{16}+z^{24}\right) + 352\cdot z^{20} \\ A^{(G\_4)}\_{B\_{19}}(z) &= \left(1+z^{40}\right) + 1\cdot\left(z^8+z^{32}\right) + 8\cdot\left(z^{12}+z^{28}\right) + 14\cdot\left(z^{16}+z^{24}\right) + 16\cdot z^{20} \\ A^{(S\_3)}\_{B\_{19}}(z) &= \left(1+z^{40}\right) + 6\cdot\left(z^8+z^{32}\right) + 22\cdot\left(z^{12}+z^{28}\right) + 57\cdot\left(z^{16}+z^{24}\right) + 84\cdot z^{20} \\ A^{(S\_5)}\_{B\_{19}}(z) &= \left(1+z^{40}\right) + 14\cdot z^{20} \\ A^{(S\_{19})}\_{B\_{19}}(z) &= \left(1+z^{40}\right) + 2\cdot z^{20} \,. \end{split}$$

The weight distributions of *B*<sup>19</sup> and their modular congruence are shown in Table 9.10.

#### **Prime 43**

We have *P* = $\left(\begin{smallmatrix}1 & 16 \\ 16 & 42\end{smallmatrix}\right)$ and *T* = $\left(\begin{smallmatrix}0 & 42 \\ 1 & 0\end{smallmatrix}\right)$, *P*, *T* ∈ PSL<sub>2</sub>(43), and the permutations of orders 3, 7, 11 and 43 are generated by $\left(\begin{smallmatrix}0 & 1 \\ 42 & 1\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 42 & 8\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 42 & 4\end{smallmatrix}\right)$ and $\left(\begin{smallmatrix}0 & 1 \\ 42 & 41\end{smallmatrix}\right)$, respectively. In addition,

$$|\text{PSL}\_2(43)| = 2^2 \cdot 3 \cdot 7 \cdot 11 \cdot 43 = 39732$$



**Table 9.10** Modular congruence weight distributions of *B*<sup>19</sup>

<sup>a</sup>*n*<sub>*i*</sub> = (*A*<sub>*i*</sub> − *A*<sub>*i*</sub>(*H*))/3420

and the weight enumerator polynomials of the invariant subcodes are

$$\begin{split} A^{(G\_2^0)}\_{B\_{43}}(z) &= \left(1 + z^{88}\right) + 44 \cdot \left(z^{16} + z^{72}\right) + 1232 \cdot \left(z^{20} + z^{68}\right) + 10241 \cdot \left(z^{24} + z^{64}\right) + \\ &\quad 54560 \cdot \left(z^{28} + z^{60}\right) + 198374 \cdot \left(z^{32} + z^{56}\right) + 491568 \cdot \left(z^{36} + z^{52}\right) + \\ &\quad 839916 \cdot \left(z^{40} + z^{48}\right) + 1002432 \cdot z^{44} \\ A^{(G\_4)}\_{B\_{43}}(z) &= \left(1 + z^{88}\right) + 32 \cdot \left(z^{20} + z^{68}\right) + 77 \cdot \left(z^{24} + z^{64}\right) + 160 \cdot \left(z^{28} + z^{60}\right) + \\ &\quad 330 \cdot \left(z^{32} + z^{56}\right) + 480 \cdot \left(z^{36} + z^{52}\right) + 616 \cdot \left(z^{40} + z^{48}\right) + 704 \cdot z^{44} \\ A^{(S\_3)}\_{B\_{43}}(z) &= \left(1 + z^{88}\right) + 7 \cdot \left(z^{16} + z^{72}\right) + 168 \cdot \left(z^{20} + z^{68}\right) + 445 \cdot \left(z^{24} + z^{64}\right) + \\ &\quad 1960 \cdot \left(z^{28} + z^{60}\right) + 4704 \cdot \left(z^{32} + z^{56}\right) + 7224 \cdot \left(z^{36} + z^{52}\right) + \\ &\quad 10843 \cdot \left(z^{40} + z^{48}\right) + 14832 \cdot z^{44} \\ A^{(S\_7)}\_{B\_{43}}(z) &= \left(1 + z^{88}\right) + 6 \cdot \left(z^{16} + z^{72}\right) + 16 \cdot \left(z^{24} + z^{64}\right) + 6 \cdot \left(z^{28} + z^{60}\right) + \\ &\quad 9 \cdot \left(z^{32} + z^{56}\right) + 48 \cdot \left(z^{36} + z^{52}\right) + 84 \cdot z^{44} \\ A^{(S\_{11})}\_{B\_{43}}(z) &= \left(1 + z^{88}\right) + 14 \cdot z^{44} \\ A^{(S\_{43})}\_{B\_{43}}(z) &= \left(1 + z^{88}\right) + 2 \cdot z^{44} \,. \end{split}$$

The weight distributions of *B*<sup>43</sup> and their modular congruence are shown in Table 9.11.

#### **Prime 59**

We have $P = \left(\begin{smallmatrix}1 & 23 \\ 23 & 58\end{smallmatrix}\right)$ and $T = \left(\begin{smallmatrix}0 & 58 \\ 1 & 0\end{smallmatrix}\right)$, $P, T \in \text{PSL}_2(59)$, and the permutations of order 3, 5, 29 and 59 are generated by $\left(\begin{smallmatrix}0 & 1 \\ 58 & 1\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 58 & 25\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 58 & 3\end{smallmatrix}\right)$ and $\left(\begin{smallmatrix}0 & 1 \\ 58 & 57\end{smallmatrix}\right)$, respectively. In addition,

$$\left|\text{PSL}_2(59)\right| = 2^2 \cdot 3 \cdot 5 \cdot 29 \cdot 59 = 102660$$
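The group orders quoted throughout this appendix follow from the standard formula $|\text{PSL}_2(p)| = p(p^2-1)/2$ for an odd prime $p$; a minimal sketch reproducing them:

```python
# |PSL_2(p)| = p * (p^2 - 1) / 2 for an odd prime p; this reproduces the
# group orders quoted for each prime in this appendix.
def psl2_order(p: int) -> int:
    return p * (p * p - 1) // 2

for p, expected in [(43, 39732), (59, 102660), (67, 150348), (83, 285852),
                    (13, 1092), (29, 12180), (53, 74412), (61, 113460)]:
    assert psl2_order(p) == expected
print(psl2_order(59))  # 102660
```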

and the weight enumerator polynomials of the invariant subcodes are

$$\begin{aligned} A^{(G_2^0)}_{\mathcal{B}_{59}}(z) ={} & \left(1+z^{120}\right) + 90 \cdot \left(z^{20}+z^{100}\right) + 2559 \cdot \left(z^{24}+z^{96}\right) + 32700 \cdot \left(z^{28}+z^{92}\right) + \\
& 278865 \cdot \left(z^{32}+z^{88}\right) + 1721810 \cdot \left(z^{36}+z^{84}\right) + 7807800 \cdot \left(z^{40}+z^{80}\right) + \\
& 26366160 \cdot \left(z^{44}+z^{76}\right) + 67152520 \cdot \left(z^{48}+z^{72}\right) + 130171860 \cdot \left(z^{52}+z^{68}\right) + \\
& 193193715 \cdot \left(z^{56}+z^{64}\right) + 220285672 \cdot z^{60} \end{aligned}$$


**Table 9.11** Modular congruence weight distributions of *B*<sup>43</sup>

<sup>a</sup>$n_i = \frac{A_i - A_i(\mathcal{H})}{39732}$

$$\begin{aligned} A^{(G_4)}_{\mathcal{B}_{59}}(z) ={} & \left(1 + z^{120}\right) + 6 \cdot \left(z^{20} + z^{100}\right) + 19 \cdot \left(z^{24} + z^{96}\right) + 132 \cdot \left(z^{28} + z^{92}\right) + \\
& 393 \cdot \left(z^{32} + z^{88}\right) + 878 \cdot \left(z^{36} + z^{84}\right) + 1848 \cdot \left(z^{40} + z^{80}\right) + 3312 \cdot \left(z^{44} + z^{76}\right) + \\
& 5192 \cdot \left(z^{48} + z^{72}\right) + 7308 \cdot \left(z^{52} + z^{68}\right) + 8931 \cdot \left(z^{56} + z^{64}\right) + 9496 \cdot z^{60} \end{aligned}$$

$$A^{(S_3)}_{\mathcal{B}_{59}}(z) = \left(1 + z^{120}\right) + 285 \cdot \left(z^{24} + z^{96}\right) + 21280 \cdot \left(z^{36} + z^{84}\right) + 239970 \cdot \left(z^{48} + z^{72}\right) + 821504 \cdot z^{60}$$

$$A^{(S_5)}_{\mathcal{B}_{59}}(z) = \left(1 + z^{120}\right) + 12 \cdot \left(z^{20} + z^{100}\right) + 711 \cdot \left(z^{40} + z^{80}\right) + 2648 \cdot z^{60}$$

$$A^{(S_{29})}_{\mathcal{B}_{59}}(z) = \left(1 + z^{120}\right) + 4 \cdot \left(z^{32} + z^{88}\right) + 6 \cdot z^{60}$$

$$A^{(S_{59})}_{\mathcal{B}_{59}}(z) = \left(1 + z^{120}\right) + 2 \cdot z^{60}.$$

The weight distributions of *B*<sup>59</sup> and their modular congruence are shown in Table 9.12.

#### **Prime 67**

We have $P = \left(\begin{smallmatrix}1 & 20 \\ 20 & 66\end{smallmatrix}\right)$ and $T = \left(\begin{smallmatrix}0 & 66 \\ 1 & 0\end{smallmatrix}\right)$, $P, T \in \text{PSL}_2(67)$, and the permutations of order 3, 11, 17 and 67 are generated by $\left(\begin{smallmatrix}0 & 1 \\ 66 & 1\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 66 & 17\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 66 & 4\end{smallmatrix}\right)$ and $\left(\begin{smallmatrix}0 & 1 \\ 66 & 65\end{smallmatrix}\right)$, respectively. In addition,

$$\left|\text{PSL}_2(67)\right| = 2^2 \cdot 3 \cdot 11 \cdot 17 \cdot 67 = 150348$$

and the weight enumerator polynomials of the invariant subcodes are

$$\begin{aligned}
A^{(G_2^0)}_{\mathcal{B}_{67}}(z) ={} & \left(1+z^{136}\right) + 578 \cdot \left(z^{24}+z^{112}\right) + 14688 \cdot \left(z^{28}+z^{108}\right) + \\
& 173247 \cdot \left(z^{32}+z^{104}\right) + 1480768 \cdot \left(z^{36}+z^{100}\right) + 9551297 \cdot \left(z^{40}+z^{96}\right) + \\
& 46687712 \cdot \left(z^{44}+z^{92}\right) + 175068210 \cdot \left(z^{48}+z^{88}\right) + 509510400 \cdot \left(z^{52}+z^{84}\right) + \\
& 1160576876 \cdot \left(z^{56}+z^{80}\right) + 2081112256 \cdot \left(z^{60}+z^{76}\right) + \\
& 2949597087 \cdot \left(z^{64}+z^{72}\right) + 3312322944 \cdot z^{68} \\
A^{(G_4)}_{\mathcal{B}_{67}}(z) ={} & \left(1+z^{136}\right) + 18 \cdot \left(z^{24}+z^{112}\right) + 88 \cdot \left(z^{28}+z^{108}\right) + 271 \cdot \left(z^{32}+z^{104}\right) + \\
& 816 \cdot \left(z^{36}+z^{100}\right) + 2001 \cdot \left(z^{40}+z^{96}\right) + 4344 \cdot \left(z^{44}+z^{92}\right) + \\
& 8386 \cdot \left(z^{48}+z^{88}\right) + 14144 \cdot \left(z^{52}+z^{84}\right) + 21260 \cdot \left(z^{56}+z^{80}\right) + \\
& 28336 \cdot \left(z^{60}+z^{76}\right) + 33599 \cdot \left(z^{64}+z^{72}\right) + 35616 \cdot z^{68} \\
A^{(S_3)}_{\mathcal{B}_{67}}(z) ={} & \left(1+z^{136}\right) + 66 \cdot \left(z^{24}+z^{112}\right) + 682 \cdot \left(z^{28}+z^{108}\right) + 3696 \cdot \left(z^{32}+z^{104}\right) + \\
& 12390 \cdot \left(z^{36}+z^{100}\right) + 54747 \cdot \left(z^{40}+z^{96}\right) + 163680 \cdot \left(z^{44}+z^{92}\right) + \\
& 318516 \cdot \left(z^{48}+z^{88}\right) + 753522 \cdot \left(z^{52}+z^{84}\right) + 1474704 \cdot \left(z^{56}+z^{80}\right) + \\
& 1763454 \cdot \left(z^{60}+z^{76}\right) + 2339502 \cdot \left(z^{64}+z^{72}\right) + 3007296 \cdot z^{68}
\end{aligned}$$


**Table 9.12** Modular congruence weight distributions of *B*<sup>59</sup>

<sup>a</sup>$n_i = \frac{A_i - A_i(\mathcal{H})}{102660}$

$$\begin{aligned} A^{(S_{11})}_{\mathcal{B}_{67}}(z) &= \left(1 + z^{136}\right) + 6 \cdot \left(z^{24} + z^{112}\right) + 16 \cdot \left(z^{36} + z^{100}\right) + 6 \cdot \left(z^{44} + z^{92}\right) + \\
& \quad\; 9 \cdot \left(z^{48} + z^{88}\right) + 48 \cdot \left(z^{56} + z^{80}\right) + 84 \cdot z^{68} \\
A^{(S_{17})}_{\mathcal{B}_{67}}(z) &= \left(1 + z^{136}\right) + 14 \cdot z^{68} \\
A^{(S_{67})}_{\mathcal{B}_{67}}(z) &= \left(1 + z^{136}\right) + 2 \cdot z^{68} \end{aligned}$$

The weight distributions of *B*<sup>67</sup> and their modular congruence are shown in Table 9.13.

#### **Prime 83**

We have $P = \left(\begin{smallmatrix}1 & 9 \\ 9 & 82\end{smallmatrix}\right)$ and $T = \left(\begin{smallmatrix}0 & 82 \\ 1 & 0\end{smallmatrix}\right)$, $P, T \in \text{PSL}_2(83)$, and the permutations of order 3, 7, 41 and 83 are generated by $\left(\begin{smallmatrix}0 & 1 \\ 82 & 1\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 82 & 10\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 82 & 4\end{smallmatrix}\right)$ and $\left(\begin{smallmatrix}0 & 1 \\ 82 & 81\end{smallmatrix}\right)$, respectively. In addition,

$$\left|\text{PSL}_2(83)\right| = 2^2 \cdot 3 \cdot 7 \cdot 41 \cdot 83 = 285852$$

and the weight enumerator polynomials of the invariant subcodes are

$$\begin{aligned}
A^{(G_2^0)}_{\mathcal{B}_{83}}(z) ={} & \left(1+z^{168}\right) + 196 \cdot \left(z^{24}+z^{144}\right) + 1050 \cdot \left(z^{28}+z^{140}\right) + 29232 \cdot \left(z^{32}+z^{136}\right) + \\
& 443156 \cdot \left(z^{36}+z^{132}\right) + 4866477 \cdot \left(z^{40}+z^{128}\right) + 42512190 \cdot \left(z^{44}+z^{124}\right) + \\
& 292033644 \cdot \left(z^{48}+z^{120}\right) + 1590338568 \cdot \left(z^{52}+z^{116}\right) + \\
& 6952198884 \cdot \left(z^{56}+z^{112}\right) + 24612232106 \cdot \left(z^{60}+z^{108}\right) + \\
& 71013075210 \cdot \left(z^{64}+z^{104}\right) + 167850453036 \cdot \left(z^{68}+z^{100}\right) + \\
& 326369180312 \cdot \left(z^{72}+z^{96}\right) + 523672883454 \cdot \left(z^{76}+z^{92}\right) + \\
& 694880243820 \cdot \left(z^{80}+z^{88}\right) + 763485528432 \cdot z^{84} \\
A^{(G_4)}_{\mathcal{B}_{83}}(z) ={} & \left(1+z^{168}\right) + 4 \cdot \left(z^{24}+z^{144}\right) + 6 \cdot \left(z^{28}+z^{140}\right) + 96 \cdot \left(z^{32}+z^{136}\right) + \\
& 532 \cdot \left(z^{36}+z^{132}\right) + 1437 \cdot \left(z^{40}+z^{128}\right) + 3810 \cdot \left(z^{44}+z^{124}\right) + \\
& 10572 \cdot \left(z^{48}+z^{120}\right) + 24456 \cdot \left(z^{52}+z^{116}\right) + 50244 \cdot \left(z^{56}+z^{112}\right) + \\
& 95030 \cdot \left(z^{60}+z^{108}\right) + 158874 \cdot \left(z^{64}+z^{104}\right) + 241452 \cdot \left(z^{68}+z^{100}\right) + \\
& 337640 \cdot \left(z^{72}+z^{96}\right) + 425442 \cdot \left(z^{76}+z^{92}\right) + 489708 \cdot \left(z^{80}+z^{88}\right) + \\
& 515696 \cdot z^{84}
\end{aligned}$$


**Table 9.13** Modular congruence weight distributions of *B*<sup>67</sup>

<sup>a</sup>$n_i = \frac{A_i - A_i(\mathcal{H})}{150348}$

$$\begin{aligned} A^{(S_3)}_{\mathcal{B}_{83}}(z) &= \left(1 + z^{168}\right) + 63 \cdot \left(z^{24} + z^{144}\right) + 8568 \cdot \left(z^{36} + z^{132}\right) + 617085 \cdot \left(z^{48} + z^{120}\right) + \\
& \quad\; 11720352 \cdot \left(z^{60} + z^{108}\right) + 64866627 \cdot \left(z^{72} + z^{96}\right) + 114010064 \cdot z^{84} \\
A^{(S_7)}_{\mathcal{B}_{83}}(z) &= \left(1 + z^{168}\right) + 789 \cdot \left(z^{56} + z^{112}\right) + 2576 \cdot z^{84} \\
A^{(S_{41})}_{\mathcal{B}_{83}}(z) &= \left(1 + z^{168}\right) + 4 \cdot \left(z^{44} + z^{124}\right) + 6 \cdot z^{84} \\
A^{(S_{83})}_{\mathcal{B}_{83}}(z) &= \left(1 + z^{168}\right) + 2 \cdot z^{84}. \end{aligned}$$

The weight distributions of *B*<sup>83</sup> and their modular congruence are shown in Table 9.14.

# *Primes −3 Modulo 8*

#### **Prime 13**

We have $P = \left(\begin{smallmatrix}3 & 4 \\ 4 & 10\end{smallmatrix}\right)$ and $T = \left(\begin{smallmatrix}0 & 12 \\ 1 & 0\end{smallmatrix}\right)$, $P, T \in \text{PSL}_2(13)$, and the permutations of order 3, 7 and 13 are generated by $\left(\begin{smallmatrix}0 & 1 \\ 12 & 1\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 12 & 3\end{smallmatrix}\right)$ and $\left(\begin{smallmatrix}0 & 1 \\ 12 & 11\end{smallmatrix}\right)$, respectively. In addition,

$$\left|\text{PSL}_2(13)\right| = 2^2 \cdot 3 \cdot 7 \cdot 13 = 1092$$

and the weight enumerator polynomials of the invariant subcodes are

$$\begin{aligned}
A^{(G_2^0)}_{\mathcal{B}_{13}}(z) &= \left(1+z^{28}\right) + 26\cdot\left(z^{8}+z^{20}\right) + 32\cdot\left(z^{10}+z^{18}\right) + 37\cdot\left(z^{12}+z^{16}\right) + 64\cdot z^{14} \\
A^{(G_4)}_{\mathcal{B}_{13}}(z) &= \left(1+z^{28}\right) + 10\cdot\left(z^{8}+z^{20}\right) + 8\cdot\left(z^{10}+z^{18}\right) + 5\cdot\left(z^{12}+z^{16}\right) + 16\cdot z^{14} \\
A^{(S_3)}_{\mathcal{B}_{13}}(z) &= \left(1+z^{28}\right) + 6\cdot\left(z^{8}+z^{20}\right) + 10\cdot\left(z^{10}+z^{18}\right) + 9\cdot\left(z^{12}+z^{16}\right) + 12\cdot z^{14} \\
A^{(S_7)}_{\mathcal{B}_{13}}(z) &= \left(1+z^{28}\right) + 2\cdot z^{14} \\
A^{(S_{13})}_{\mathcal{B}_{13}}(z) &= \left(1+z^{28}\right) + 2\cdot z^{14}.
\end{aligned}$$

The weight distributions of *B*<sup>13</sup> and their modular congruence are shown in Table 9.15.

#### **Prime 29**

We have $P = \left(\begin{smallmatrix}2 & 13 \\ 13 & 27\end{smallmatrix}\right)$ and $T = \left(\begin{smallmatrix}0 & 28 \\ 1 & 0\end{smallmatrix}\right)$, $P, T \in \text{PSL}_2(29)$, and the permutations of order 3, 5, 7 and 29 are generated by $\left(\begin{smallmatrix}0 & 1 \\ 28 & 1\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 28 & 5\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 28 & 3\end{smallmatrix}\right)$ and $\left(\begin{smallmatrix}0 & 1 \\ 28 & 27\end{smallmatrix}\right)$, respectively. In addition,

$$\left|\text{PSL}_2(29)\right| = 2^2 \cdot 3 \cdot 5 \cdot 7 \cdot 29 = 12180$$





**Table 9.15** Modular congruence weight distributions of *B*<sup>13</sup>

<sup>a</sup>$n_i = \frac{A_i - A_i(\mathcal{H})}{1092}$

and the weight enumerator polynomials of the invariant subcodes are

$$\begin{aligned}
A^{(G_2^0)}_{\mathcal{B}_{29}}(z) ={} & \left(1+z^{60}\right) + 28 \cdot \left(z^{12}+z^{48}\right) + 112 \cdot \left(z^{14}+z^{46}\right) + 394 \cdot \left(z^{16}+z^{44}\right) + \\
& 1024 \cdot \left(z^{18}+z^{42}\right) + 1708 \cdot \left(z^{20}+z^{40}\right) + 3136 \cdot \left(z^{22}+z^{38}\right) + \\
& 5516 \cdot \left(z^{24}+z^{36}\right) + 7168 \cdot \left(z^{26}+z^{34}\right) + 8737 \cdot \left(z^{28}+z^{32}\right) + 9888 \cdot z^{30} \\
A^{(G_4)}_{\mathcal{B}_{29}}(z) ={} & \left(1+z^{60}\right) + 12 \cdot \left(z^{14}+z^{46}\right) + 30 \cdot \left(z^{16}+z^{44}\right) + 32 \cdot \left(z^{18}+z^{42}\right) + \\
& 60 \cdot \left(z^{20}+z^{40}\right) + 48 \cdot \left(z^{22}+z^{38}\right) + 60 \cdot \left(z^{24}+z^{36}\right) + \\
& 96 \cdot \left(z^{26}+z^{34}\right) + 105 \cdot \left(z^{28}+z^{32}\right) + 136 \cdot z^{30} \\
A^{(S_3)}_{\mathcal{B}_{29}}(z) ={} & \left(1+z^{60}\right) + 10 \cdot \left(z^{12}+z^{48}\right) + 70 \cdot \left(z^{18}+z^{42}\right) + 245 \cdot \left(z^{24}+z^{36}\right) + 372 \cdot z^{30} \\
A^{(S_5)}_{\mathcal{B}_{29}}(z) ={} & \left(1+z^{60}\right) + 15 \cdot \left(z^{20}+z^{40}\right) + 32 \cdot z^{30} \\
A^{(S_7)}_{\mathcal{B}_{29}}(z) ={} & \left(1+z^{60}\right) + 6 \cdot \left(z^{16}+z^{44}\right) + 2 \cdot \left(z^{18}+z^{42}\right) + 8 \cdot \left(z^{22}+z^{38}\right) + \\
& 8 \cdot \left(z^{24}+z^{36}\right) + 1 \cdot \left(z^{28}+z^{32}\right) + 12 \cdot z^{30} \\
A^{(S_{29})}_{\mathcal{B}_{29}}(z) ={} & \left(1+z^{60}\right) + 2 \cdot z^{30}.
\end{aligned}$$

The weight distributions of *B*<sup>29</sup> and their modular congruence are shown in Table 9.16.

#### **Prime 53**

We have $P = \left(\begin{smallmatrix}3 & 19 \\ 19 & 50\end{smallmatrix}\right)$ and $T = \left(\begin{smallmatrix}0 & 52 \\ 1 & 0\end{smallmatrix}\right)$, $P, T \in \text{PSL}_2(53)$, and the permutations of order 3, 13 and 53 are generated by $\left(\begin{smallmatrix}0 & 1 \\ 52 & 1\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 52 & 8\end{smallmatrix}\right)$ and $\left(\begin{smallmatrix}0 & 1 \\ 52 & 51\end{smallmatrix}\right)$, respectively. In addition,

$$\left|\text{PSL}_2(53)\right| = 2^2 \cdot 3^3 \cdot 13 \cdot 53 = 74412$$


**Table 9.16** Modular congruence weight distributions of *B*<sup>29</sup>

<sup>a</sup>$n_i = \frac{A_i - A_i(\mathcal{H})}{12180}$

and the weight enumerator polynomials of the invariant subcodes are

$$\begin{aligned}
A^{(G_2^0)}_{\mathcal{B}_{53}}(z) ={} & \left(1+z^{108}\right) + 234 \cdot \left(z^{20}+z^{88}\right) + 1768 \cdot \left(z^{22}+z^{86}\right) + 5655 \cdot \left(z^{24}+z^{84}\right) + \\
& 16328 \cdot \left(z^{26}+z^{82}\right) + 47335 \cdot \left(z^{28}+z^{80}\right) + 127896 \cdot \left(z^{30}+z^{78}\right) + \\
& 316043 \cdot \left(z^{32}+z^{76}\right) + 705848 \cdot \left(z^{34}+z^{74}\right) + 1442883 \cdot \left(z^{36}+z^{72}\right) + \\
& 2728336 \cdot \left(z^{38}+z^{70}\right) + 4786873 \cdot \left(z^{40}+z^{68}\right) + 7768488 \cdot \left(z^{42}+z^{66}\right) + \\
& 11636144 \cdot \left(z^{44}+z^{64}\right) + 16175848 \cdot \left(z^{46}+z^{62}\right) + 20897565 \cdot \left(z^{48}+z^{60}\right) + \\
& 25055576 \cdot \left(z^{50}+z^{58}\right) + 27976131 \cdot \left(z^{52}+z^{56}\right) + 29057552 \cdot z^{54} \\
A^{(G_4)}_{\mathcal{B}_{53}}(z) ={} & \left(1+z^{108}\right) + 12 \cdot \left(z^{20}+z^{88}\right) + 12 \cdot \left(z^{22}+z^{86}\right) + 77 \cdot \left(z^{24}+z^{84}\right) + \\
& 108 \cdot \left(z^{26}+z^{82}\right) + 243 \cdot \left(z^{28}+z^{80}\right) + 296 \cdot \left(z^{30}+z^{78}\right) + 543 \cdot \left(z^{32}+z^{76}\right) + \\
& 612 \cdot \left(z^{34}+z^{74}\right) + 1127 \cdot \left(z^{36}+z^{72}\right) + 1440 \cdot \left(z^{38}+z^{70}\right) + 2037 \cdot \left(z^{40}+z^{68}\right) + \\
& 2636 \cdot \left(z^{42}+z^{66}\right) + 3180 \cdot \left(z^{44}+z^{64}\right) + 3672 \cdot \left(z^{46}+z^{62}\right) + 4289 \cdot \left(z^{48}+z^{60}\right) + \\
& 4836 \cdot \left(z^{50}+z^{58}\right) + 4875 \cdot \left(z^{52}+z^{56}\right) + 5544 \cdot z^{54}
\end{aligned}$$


**Table 9.17** Modular congruence weight distributions of *B*<sup>53</sup>

<sup>a</sup>$n_i = \frac{A_i - A_i(\mathcal{H})}{74412}$


**Table 9.18** Modular congruence weight distributions of *B*<sup>61</sup>

<sup>a</sup>$n_i = \frac{A_i - A_i(\mathcal{H})}{113460}$

$$\begin{aligned}
A^{(S_3)}_{\mathcal{B}_{53}}(z) &= \left(1 + z^{108}\right) + 234 \cdot \left(z^{24} + z^{84}\right) + 1962 \cdot \left(z^{30} + z^{78}\right) + 9672 \cdot \left(z^{36} + z^{72}\right) + \\
& \quad\; 28728 \cdot \left(z^{42} + z^{66}\right) + 55629 \cdot \left(z^{48} + z^{60}\right) + 69692 \cdot z^{54} \\
A^{(S_{13})}_{\mathcal{B}_{53}}(z) &= \left(1 + z^{108}\right) + 6 \cdot \left(z^{28} + z^{80}\right) + 2 \cdot \left(z^{30} + z^{78}\right) + 8 \cdot \left(z^{40} + z^{68}\right) + \\
& \quad\; 8 \cdot \left(z^{42} + z^{66}\right) + 1 \cdot \left(z^{52} + z^{56}\right) + 12 \cdot z^{54} \\
A^{(S_{53})}_{\mathcal{B}_{53}}(z) &= \left(1 + z^{108}\right) + 2 \cdot z^{54}.
\end{aligned}$$

The weight distributions of *B*<sup>53</sup> and their modular congruence are shown in Table 9.17.

#### **Prime 61**

We have $P = \left(\begin{smallmatrix}2 & 19 \\ 19 & 59\end{smallmatrix}\right)$ and $T = \left(\begin{smallmatrix}0 & 60 \\ 1 & 0\end{smallmatrix}\right)$, $P, T \in \text{PSL}_2(61)$, and the permutations of order 3, 5, 31 and 61 are generated by $\left(\begin{smallmatrix}0 & 1 \\ 60 & 1\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 60 & 17\end{smallmatrix}\right)$, $\left(\begin{smallmatrix}0 & 1 \\ 60 & 5\end{smallmatrix}\right)$ and $\left(\begin{smallmatrix}0 & 1 \\ 60 & 59\end{smallmatrix}\right)$, respectively. In addition,

$$\left|\text{PSL}_2(61)\right| = 2^2 \cdot 3 \cdot 5 \cdot 31 \cdot 61 = 113460$$

and the weight enumerator polynomials of the invariant subcodes are

$$\begin{aligned}
A^{(G_2^0)}_{\mathcal{B}_{61}}(z) ={} & \left(1+z^{124}\right) + 208 \cdot \left(z^{20}+z^{104}\right) + 400 \cdot \left(z^{22}+z^{102}\right) + 1930 \cdot \left(z^{24}+z^{100}\right) + \\
& 8180 \cdot \left(z^{26}+z^{98}\right) + 26430 \cdot \left(z^{28}+z^{96}\right) + 84936 \cdot \left(z^{30}+z^{94}\right) + \\
& 253572 \cdot \left(z^{32}+z^{92}\right) + 696468 \cdot \left(z^{34}+z^{90}\right) + 1725330 \cdot \left(z^{36}+z^{88}\right) + \\
& 3972240 \cdot \left(z^{38}+z^{86}\right) + 8585008 \cdot \left(z^{40}+z^{84}\right) + 17159632 \cdot \left(z^{42}+z^{82}\right) + \\
& 31929532 \cdot \left(z^{44}+z^{80}\right) + 55569120 \cdot \left(z^{46}+z^{78}\right) + 90336940 \cdot \left(z^{48}+z^{76}\right) + \\
& 137329552 \cdot \left(z^{50}+z^{74}\right) + 195328240 \cdot \left(z^{52}+z^{72}\right) + 260435936 \cdot \left(z^{54}+z^{70}\right) + \\
& 325698420 \cdot \left(z^{56}+z^{68}\right) + 381677080 \cdot \left(z^{58}+z^{66}\right) + 419856213 \cdot \left(z^{60}+z^{64}\right) + \\
& 433616560 \cdot z^{62} \\
A^{(G_4)}_{\mathcal{B}_{61}}(z) ={} & \left(1+z^{124}\right) + 12 \cdot \left(z^{20}+z^{104}\right) + 12 \cdot \left(z^{22}+z^{102}\right) + 36 \cdot \left(z^{24}+z^{100}\right) + \\
& 40 \cdot \left(z^{26}+z^{98}\right) + 140 \cdot \left(z^{28}+z^{96}\right) + 176 \cdot \left(z^{30}+z^{94}\right) + 498 \cdot \left(z^{32}+z^{92}\right) + \\
& 576 \cdot \left(z^{34}+z^{90}\right) + 1340 \cdot \left(z^{36}+z^{88}\right) + 1580 \cdot \left(z^{38}+z^{86}\right) + 2660 \cdot \left(z^{40}+z^{84}\right) + \\
& 3432 \cdot \left(z^{42}+z^{82}\right) + 4932 \cdot \left(z^{44}+z^{80}\right) + 6368 \cdot \left(z^{46}+z^{78}\right) + 8820 \cdot \left(z^{48}+z^{76}\right) + \\
& 10424 \cdot \left(z^{50}+z^{74}\right) + 12752 \cdot \left(z^{52}+z^{72}\right) + 14536 \cdot \left(z^{54}+z^{70}\right) + \\
& 15840 \cdot \left(z^{56}+z^{68}\right) + 18296 \cdot \left(z^{58}+z^{66}\right) + 18505 \cdot \left(z^{60}+z^{64}\right) + 20192 \cdot z^{62} \\
A^{(S_3)}_{\mathcal{B}_{61}}(z) ={} & \left(1+z^{124}\right) + 30 \cdot \left(z^{20}+z^{104}\right) + 10 \cdot \left(z^{22}+z^{102}\right) + 50 \cdot \left(z^{24}+z^{100}\right) + \\
& 200 \cdot \left(z^{26}+z^{98}\right) + 620 \cdot \left(z^{28}+z^{96}\right) + 960 \cdot \left(z^{30}+z^{94}\right) + 2416 \cdot \left(z^{32}+z^{92}\right) + \\
& 4992 \cdot \left(z^{34}+z^{90}\right) + 6945 \cdot \left(z^{36}+z^{88}\right) + 15340 \cdot \left(z^{38}+z^{86}\right) + \\
& 25085 \cdot \left(z^{40}+z^{84}\right) + 34920 \cdot \left(z^{42}+z^{82}\right) + 68700 \cdot \left(z^{44}+z^{80}\right) + \\
& 87548 \cdot \left(z^{46}+z^{78}\right) + 104513 \cdot \left(z^{48}+z^{76}\right) + 177800 \cdot \left(z^{50}+z^{74}\right) + \\
& 201440 \cdot \left(z^{52}+z^{72}\right) + 225290 \cdot \left(z^{54}+z^{70}\right) + 322070 \cdot \left(z^{56}+z^{68}\right) + \\
& 301640 \cdot \left(z^{58}+z^{66}\right) + 316706 \cdot \left(z^{60}+z^{64}\right) + 399752 \cdot z^{62} \\
A^{(S_5)}_{\mathcal{B}_{61}}(z) ={} & \left(1+z^{124}\right) + 3 \cdot \left(z^{20}+z^{104}\right) + 24 \cdot \left(z^{26}+z^{98}\right) + 48 \cdot \left(z^{28}+z^{96}\right) + \\
& 6 \cdot \left(z^{30}+z^{94}\right) + 150 \cdot \left(z^{32}+z^{92}\right) + 8 \cdot \left(z^{34}+z^{90}\right) + 168 \cdot \left(z^{36}+z^{88}\right) + \\
& 96 \cdot \left(z^{38}+z^{86}\right) + 75 \cdot \left(z^{40}+z^{84}\right) + 468 \cdot \left(z^{42}+z^{82}\right) + 132 \cdot \left(z^{44}+z^{80}\right) + \\
& 656 \cdot \left(z^{46}+z^{78}\right) + 680 \cdot \left(z^{48}+z^{76}\right) + 300 \cdot \left(z^{50}+z^{74}\right) + 1386 \cdot \left(z^{52}+z^{72}\right) + \\
& 198 \cdot \left(z^{54}+z^{70}\right) + 1152 \cdot \left(z^{56}+z^{68}\right) + 1272 \cdot \left(z^{58}+z^{66}\right) + \\
& 301 \cdot \left(z^{60}+z^{64}\right) + 2136 \cdot z^{62} \\
A^{(S_{31})}_{\mathcal{B}_{61}}(z) ={} & \left(1+z^{124}\right) + 2 \cdot z^{62} \\
A^{(S_{61})}_{\mathcal{B}_{61}}(z) ={} & \left(1+z^{124}\right) + 2 \cdot z^{62}.
\end{aligned}$$

The weight distributions of *B*<sup>61</sup> and their modular congruence are shown in Table 9.18.

# *Weight Distributions of Quadratic Residue Codes for Primes 151 and 167*

See Tables 9.19 and 9.20.


**Table 9.19** Weight distributions of QR and extended QR codes of prime 151


**Table 9.20** Weight distributions of QR and extended QR codes of prime 167

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 10 Historical Convolutional Codes as Tail-Biting Block Codes**

#### **10.1 Introduction**

In the late 1950s, a branch of error-correcting codes known as convolutional codes [1, 6, 11, 14] was explored almost independently of block codes, and each discipline had its champions. For convolutional codes, sequential decoding was the norm, and most of the literature on the subject was concerned with the performance of practical decoders and different decoding algorithms [2]; there were few publications on the theoretical analysis of convolutional codes. In contrast, there was a great deal of theory about linear, binary block codes but comparatively little about decoders, apart from hard decision decoding. Soft decision decoding of block codes was considered to be quite impractical, except for trivial, very short codes.

With Andrew Viterbi's invention [13] of the maximum likelihood decoder in 1967, featuring a trellis-based decoder, an enormous impetus was given to convolutional codes and soft decision decoding. Interestingly, the algorithm itself, as a method for solving the travelling salesman's problem [12], had been known since 1960. Consequently, interest in hard decision decoding of convolutional codes waned in favour of soft decision decoding. Correspondingly, block codes were suddenly out of fashion, except for the ubiquitous Reed–Solomon codes.

For sequential decoder applications, the convolutional codes used were systematic codes with one or more feedforward polynomials, whereas for applications using a Viterbi decoder, the convolutional codes were optimised for the largest minimum Hamming distance between codewords, *d*<sub>free</sub>, for a given memory (the highest degree of the generator polynomials defining the code). The result is always a non-systematic code. It should be noted that in the context of convolutional codes, the minimum Hamming distance between codewords is understood to be evaluated over the constraint length, the memory of the code; this is traditionally called *d*<sub>min</sub>, which is rather confusing when comparing the minimum Hamming distance of block codes with that of convolutional codes. A true comparison should set the *d*<sub>free</sub> of a convolutional code against the *d*<sub>min</sub> of a block code, for a given code rate.
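The *d*<sub>free</sub> of a convolutional code can be estimated by exhaustively encoding all short nonzero input sequences and taking the minimum codeword weight. The sketch below does this for the small memory-2 nonsystematic code with generators 1 + *x* + *x*² and 1 + *x*² (the standard textbook (7, 5) octal code, used here purely as an illustration; it is not one of the codes discussed in this chapter).

```python
# Estimate d_free of a rate-1/2 convolutional code by encoding all short
# nonzero inputs and taking the minimum total output weight.
# Generators as bit masks, LSB = x^0: (7,5) octal = 1+x+x^2 and 1+x^2.
G1, G2 = 0b111, 0b101

def poly_mul_weight(d: int, g: int) -> int:
    """Hamming weight of the GF(2) (carry-less) product d(x)*g(x)."""
    acc = 0
    while g:
        if g & 1:
            acc ^= d
        g >>= 1
        d <<= 1
    return bin(acc).count("1")

# Minimum over all nonzero data polynomials up to degree 11.
d_free = min(poly_mul_weight(d, G1) + poly_mul_weight(d, G2)
             for d in range(1, 1 << 12))
print(d_free)  # 5 for the (7,5) code
```

Longer input sequences only add weight, so the search over short inputs already finds the free distance for this small code.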


**Table 10.1** Best rate 1/2 convolutional codes designed for Viterbi decoding

Since the early 1960s, a great deal of work has been carried out on block codes and convolutional codes for applications in deep space communications, primarily because providing a high signal-to-noise ratio there is so expensive. Error-correcting codes allowed the required signal-to-noise ratio to be reduced.

The first coding arrangement implemented for space [6, 9] was part of the payload of Pioneer 9, which was launched into space in 1968. The payload featured a systematic convolutional code designed by Lin and Lyne [7] with a *d*<sub>free</sub> of 12 and a memory of 20. The generator polynomial is

$$g(x) = 1 + x + x^{2} + x^{5} + x^{6} + x^{8} + x^{9} + x^{12} + x^{13} + x^{14} + x^{16} + x^{17} + x^{18} + x^{19} + x^{20}.$$

This convolutional code was used with soft decision, sequential decoding featuring the Fano algorithm [2] to realise a coding gain of 3 dB. Interestingly, it was initially planned as a communications experiment and not envisaged to be used operationally to send telemetry data to Earth. However, its superior performance over the standard operational communications system which featured uncoded transmission meant that it was always used instead of the standard system.

In 1969, the Mariner '69 spacecraft was launched with a first-order Reed–Muller (32, 6, 16) code [8], equivalent to the extended (32, 6, 16) cyclic code. A maximum likelihood correlation decoder was used, and the coding gain was 2.2 dB [9].

By the mid 1970s, the standard for soft decision decoding on the AWGN channel, notably in applications for satellite and space communications, was to use convolutional codes with Viterbi decoding, featuring the constraint length 7 code listed in Table 10.1. The generator polynomials are *r*<sub>1</sub>(*x*) = 1 + *x* + *x*<sup>2</sup> + *x*<sup>3</sup> + *x*<sup>6</sup> and *r*<sub>2</sub>(*x*) = 1 + *x*<sup>2</sup> + *x*<sup>3</sup> + *x*<sup>5</sup> + *x*<sup>6</sup>, and the code is best known, in octal representation, as the (171, 133) code. The best half rate convolutional codes designed for use with Viterbi decoding [1, 6] are tabulated in Table 10.1.
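The octal names encode the generator-polynomial taps. A minimal sketch of the conversion, assuming the usual convention that the most significant bit of the octal value is the *x*<sup>0</sup> tap:

```python
# Recover generator-polynomial exponents from an octal code name,
# assuming the most significant bit corresponds to the x^0 tap.
def octal_to_poly(octal_str: str, memory: int = 6):
    bits = format(int(octal_str, 8), "0{}b".format(memory + 1))
    return [i for i, b in enumerate(bits) if b == "1"]

print(octal_to_poly("171"))  # [0, 1, 2, 3, 6] -> 1 + x + x^2 + x^3 + x^6
print(octal_to_poly("133"))  # [0, 2, 3, 5, 6] -> 1 + x^2 + x^3 + x^5 + x^6
```

These exponent lists reproduce *r*<sub>1</sub>(*x*) and *r*<sub>2</sub>(*x*) as given above.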

The (171, 133) code with Viterbi soft decision decoding featured a coding gain of 5.1 dB at a 10<sup>−5</sup> bit error rate, which was around 2 dB better than its nearest rival, a high memory convolutional code with hard decision, sequential decoding. The (171, 133) convolutional code is one of the recommended NASA Planetary Standard Codes [3].

However, more coding gain was achieved by concatenating the (171, 133) convolutional code with a (255, 223) Reed–Solomon (RS) code, which is able to correct 16 symbol errors, each symbol being 8 bits. Quite a long interleaver needs to be used between the Viterbi decoder output and the RS decoder in order to break up the occasional error bursts output by the Viterbi decoder. Interleaver lengths vary from 4080 to 16320 bits, and with the longest interleaver the coding gain of the concatenated arrangement is 7.25 dB (*E<sub>b</sub>*/*N*<sub>0</sub> = 2.35 dB at a 10<sup>−5</sup> bit error rate); it is a CCSDS [3] standard for space communications.
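The symbol-error-correcting capability quoted above follows directly from *t* = (*n* − *k*)/2 for an RS code; with the CCSDS parameters *n* = 255 and *k* = 223:

```python
# Symbol-error-correcting capability t = (n - k) / 2 of a Reed-Solomon code.
n, k = 255, 223          # CCSDS RS code over GF(2^8), 8-bit symbols
t = (n - k) // 2
print(t)                 # 16 symbol errors
```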

## **10.2 Convolutional Codes and Circulant Block Codes**

It is straightforward to show that a double-circulant code is a half rate, tail-biting, feedforward convolutional code. Consider the Pioneer 9, half rate, convolutional code invented by Lin and Lyne [7] with generator polynomial

$$r(x) = 1 + x + x^2 + x^3 + x^6 + x^8 + x^9 + x^{12} + x^{13} + x^{14} + x^{16} + x^{17} + x^{18} + x^{19} + x^{20}$$

For a semi-infinite data sequence defined by *d(x)*, the corresponding codeword, *c(x)*, of the convolutional code consists of

$$c(\mathbf{x}) = d(\mathbf{x}) \| d(\mathbf{x}) r(\mathbf{x}) \tag{10.1}$$

where $\|$ represents the interlacing of the data polynomial, representing the data sequence, with the parity polynomial, representing the sequence of parity bits.

The same generator polynomial can be used to define a block code of length 2*n*, a (2*n*, *n*) double-circulant code, with a codeword consisting of

$$c(x) = d(x) \| d(x) r(x) \text{ modulo } (1 + x^n) \tag{10.2}$$

(Double-circulant codewords usually consist of one circulant followed by the second but it is clear that an equivalent code is obtained by interlacing the two circulants instead.)

Comparing Eq. (10.1) with (10.2) as *n* → ∞, it can be seen that the same codewords will be obtained. For finite *n*, the tail of the convolution of *d(x)* and *r(x)* wraps around, adding to the beginning, as in a tail-biting convolutional code. If *n* is sufficiently long, only long convolutions will be affected by the wrap-around, and these convolution results will be of high Hamming weight anyway. It follows that, for sufficiently long *n*, the *dmin* of the circulant code will be the same as the *dfree* of the convolutional code. Indeed, the low weight spectral terms of the two codes will be identical, as is borne out by codeword enumeration using the methods described in Chap. 5.
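The wrap-around in Eq. (10.2) can be illustrated with a short sketch. The bit-list polynomial representation and the single-bit test message are implustrative implementation choices, not taken from the text:

```python
# GF(2) polynomials as bit lists: poly[i] is the coefficient of x^i.
def gf2_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] ^= bj
    return out

def wrap(poly, n):
    # Reduce modulo (1 + x^n): the tail x^(n+t) wraps around onto x^t.
    res = [0] * n
    for i, c in enumerate(poly):
        res[i % n] ^= c
    return res

def circulant_encode(d, r, n):
    # Codeword of Eq. (10.2): d(x) followed by d(x) r(x) mod (1 + x^n).
    d = (d + [0] * n)[:n]
    return d + wrap(gf2_mul(d, r), n)

# Pioneer 9 generator r(x), exponents taken from the text.
exps = {0, 1, 2, 3, 6, 8, 9, 12, 13, 14, 16, 17, 18, 19, 20}
r = [1 if i in exps else 0 for i in range(21)]

c = circulant_encode([1], r, 34)   # encode a single information bit
print(len(c), sum(c))              # 68 16: a weight 16 codeword of the (68, 34, 12) code
```

For a longer data polynomial the product *d(x)r(x)* exceeds degree 33 and the `wrap` step folds the tail back, exactly the tail-biting effect described above.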

For the Pioneer 9 code, which has a *dfree* of 12, a double-circulant code with *dmin* also equal to 12 can be obtained with *n* as low as 34, producing a (68, 34, 12) code. It is noteworthy that this is not a very long code, particularly by modern standards.

Codewords of the double-circulant code are given by

$$c(x) = d(x) \| d(x)(1 + x + x^2 + x^3 + x^6 + x^8 + x^9 + x^{12} + x^{13} + x^{14} + x^{16} + x^{17} + x^{18} + x^{19} + x^{20}) \text{ modulo } (1 + x^{34}) \tag{10.3}$$

As a double-circulant block code, this code can be soft decision decoded, with near maximum likelihood decoding using an extended Dorsch decoder, described in Chap. 15. The results for the AWGN channel are shown plotted in Fig. 10.1. Also plotted in Fig. 10.1 are the results obtained with the same convolutional code realised as a (120, 60, 12) double-circulant code which features less wrap around effects compared to the (68, 34, 12) code.

Using the original sequential decoding with 8 level quantisation of the soft decisions realised a coding gain of 3 dB at a BER of 5 × 10<sup>−4</sup>. Using the modified Dorsch decoder with this code can realise a coding gain of over 5 dB at a BER of 5 × 10<sup>−4</sup> and over 6 dB at a BER of 10<sup>−6</sup>, as is evident from Fig. 10.1. Moreover, there is no need for termination bits with the tail-biting arrangement. However, it should be noted that the state-of-the-art, modified Dorsch decoder with soft decision decoding

**Fig. 10.1** BER performance of the Pioneer 9 convolutional code encoded as a (68, 34, 12) or (120, 60, 12) double-circulant code with soft and hard decision, extended Dorsch decoding, in comparison to uncoded QPSK

needs to evaluate up to 500,000 codewords per received vector for the (68, 34, 12) double-circulant code realisation and up to 1,000,000 codewords per received vector for the (120, 60, 12) version in order to achieve near maximum likelihood decoding. Figure 10.1 also shows the hard decision decoding performance realised with the modified, hard decision Dorsch decoder, also described in Chap. 15. The (120, 60, 12) double-circulant code version has a degradation of 2.3 dB at 10<sup>−4</sup> BER compared to soft decision decoding, but still achieves a coding gain of 3.3 dB at 10<sup>−4</sup> BER. Similarly, the (68, 34, 12) version has a degradation of 2.2 dB at 10<sup>−4</sup> BER compared to soft decision decoding, but still achieves a coding gain of 2.3 dB at 10<sup>−4</sup> BER.

The conclusion to be drawn from Fig. 10.1 is that the Pioneer 9 coding system was limited not by the design of the code but by the design of the decoder. However, to be fair, the cost of a Dorsch decoder would have been considered beyond reach back in 1967.

It is interesting to discuss the differences in performance between the (68, 34, 12) and (120, 60, 12) double-circulant code versions of the Pioneer 9 convolutional code. Both have a *dmin* of 12. However, the number of weight 12 codewords (the multiplicity of weight 12 terms in the codes' weight distributions) is higher for the (68, 34, 12) version, due to the wrap-around of the second circulant, which is only of length 34. The tails of the circulants of codewords of weight higher than 12 do suffer some cancellation with the beginnings of the circulants. In fact, exhaustive weight spectrum analysis (see Chaps. 5 and 13 for descriptions of the different methods that can be used) shows that the multiplicity of weight 12 codewords is 714 for the (68, 34, 12) code and only 183 for the (120, 60, 12) code.

Moreover, the covering radius of the (68, 34, 12) code has been evaluated and found to be 10, indicating that this code is well packed, whereas the covering radius of the (120, 60, 12) code is much higher at 16, indicating that that code is not so well packed. Indeed, the code rate of the (120, 60, 12) code can be increased without degrading the minimum Hamming distance because, with a covering radius of 16, at least one more information bit may be added to the code.

With maximum likelihood, hard decision decoding, which the modified Dorsch decoder is able to achieve, up to 10 hard decision errors can be corrected with the (68, 34, 12) code, in comparison with up to 16 hard decision errors correctable by the (120, 60, 12) code. Note that in both cases these are considerably higher numbers of correctable errors than suggested by the *dfree* of the code (only five hard decision errors are guaranteed to be correctable). This is a recurrent theme for maximum likelihood, hard decision decoding of codes, as discussed in Chap. 3, compared to bounded distance decoding.

It is also interesting to compare the performance of other convolutional codes that have been designed for space applications and were originally intended to be used with sequential decoding. Of course now we have available the far more powerful (and more signal processing intensive) modified Dorsch decoder, which can be used with any linear code.

Massey and Costello [6, 10] constructed a rate 1/2, memory 31 non-systematic code which was more powerful than any systematic code with the same memory and had the useful property that the information bits can be obtained from the two convolutionally encoded parity streams simply by adding them together, modulo 2. The necessary condition for this property is that the two generator polynomials differ in only a single coefficient. The two generator polynomials, *r*0*(x)* and *r*1*(x)*, may be described by the exponents of their non-zero coefficients:

$$r\_0(\mathbf{x}) \leftarrow \{0, 1, 2, 4, 5, 7, 8, 9, 11, 13, 14, 16, 17, 18, 19, 21, 22, 23, 24, 25, 27, 28, 29, 31\}$$

$$r\_1(\mathbf{x}) \leftarrow \{0, 2, 4, 5, 7, 8, 9, 11, 13, 14, 16, 17, 18, 19, 21, 22, 23, 24, 25, 27, 28, 29, 31\}$$

As can be seen, the two generator polynomials differ only in the coefficient of *x*. This code has a *dfree* of 23 and can be realised as a (180, 90, 23) double-circulant code from the tail-biting version of the same convolutional code. This convolutional code has exceptional performance and, in double-circulant form, may of course be decoded using the extended Dorsch decoder. The performance of the code in (180, 90, 23) form, for the soft decision and hard decision AWGN channel, is shown in Fig. 10.2. For comparison purposes, the performances of the Pioneer 9 codes are also shown in Fig. 10.2. Shorter double-circulant code constructions are possible from this convolutional code in tail-biting form, without compromising the *dmin* of the double-circulant code. The shortest version is the (166, 83, 23) double-circulant code.
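The quick decoding property is easy to check numerically: since the two generators differ only in the coefficient of $x$, their modulo 2 sum is exactly $x$, so adding the two parity streams reproduces the data delayed by one position. A sketch (the random test message is an illustrative assumption):

```python
import random

# GF(2) polynomial multiplication with bit lists.
def gf2_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] ^= bj
    return out

# Massey-Costello generator exponents from the text; r1 omits only the exponent 1.
R0 = {0, 1, 2, 4, 5, 7, 8, 9, 11, 13, 14, 16, 17, 18, 19, 21, 22, 23, 24, 25, 27, 28, 29, 31}
R1 = R0 - {1}
r0 = [1 if i in R0 else 0 for i in range(32)]
r1 = [1 if i in R1 else 0 for i in range(32)]

d = [random.randint(0, 1) for _ in range(50)]   # arbitrary data sequence
s = [x ^ y for x, y in zip(gf2_mul(d, r0), gf2_mul(d, r1))]

# The modulo 2 sum of the two streams equals d(x) * x: the data shifted one place.
print(s[1:len(d) + 1] == d)  # True
```

No decoding computation at all is needed to recover the information bits from an error-free received pair of streams, which is the attraction of this construction.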

**Fig. 10.2** BER performance of the Massey–Costello convolutional code in (180, 90, 23) double-circulant code form for the AWGN channel, using soft and hard decisions, with extended Dorsch decoding

By truncating the generator polynomials *r*0*(x)* and *r*1*(x)* above, a reduced memory convolutional code with memory 23 and a *dfree* of 17 can be obtained, as discussed by Massey and Lin [6, 10], which still has the non-systematic, quick decoding property. The generator polynomials are given by the exponents of the non-zero coefficients:

$$\begin{aligned} \hat{r}\_0(\mathbf{x}) &\leftarrow \{0, 1, 2, 4, 5, 7, 8, 9, 11, 13, 14, 16, 17, 18, 19, 21, 22, 23\} \\ \hat{r}\_1(\mathbf{x}) &\leftarrow \{0, 2, 4, 5, 7, 8, 9, 11, 13, 14, 16, 17, 18, 19, 21, 22, 23\} \end{aligned}$$

A *(*160*,* 80*,* 17*)* double-circulant code can be obtained from the tail-biting version of this convolutional code. In fact, many double-circulant codes with high *dmin* can be obtained from tail-biting versions of convolutional codes.

It is straightforward to write a program in C++ which searches for the generator polynomials that produce the convolutional codes with the highest values of *dfree*. The only other constraint is that the generator polynomials need to be relatively prime to each other, that is, the GCD of the generator polynomials needs to be 1 in order to avoid a catastrophic code [6]. However, it is also necessary in selecting the generator polynomials that the wrap-around effects of the circulants are taken into account, otherwise the *dmin* of the double-circulant code is not as high as the *dfree* of the convolutional code from which it is derived. Indeed, to construct a good code in this way with high *dfree* and high *dmin*, it has to be constructed as a tail-biting convolutional code right from the start. One example of a good tail-biting convolutional code that has been found in this way has generator polynomials *r*0*(x)* and *r*1*(x)* given by the exponents of the non-zero coefficients:

$$\begin{aligned} r\_0(\mathbf{x}) &\leftarrow \{0, 2, 5, 8, 9, 10, 12, 13, 14, 15, 27\} \\ r\_1(\mathbf{x}) &\leftarrow \{0, 1, 2, 3, 4, 5, 7, 8, 11, 12, 16, 18, 20, 23, 27\} \end{aligned}$$

This code has a memory of 27 and a *dfree* of 26. It may be realised in double-circulant form as a (180, 90, 26) double-circulant code, and weight spectrum analysis shows that this code has the same *dmin* of 26 as the best-known code with the same code parameters [4]. The two polynomials *r*0*(x)* and *r*1*(x)* factorise into polynomials with the following exponents of the non-zero coefficients:

$$\begin{aligned} r\_0(\mathbf{x}) &\leftarrow \{0, 3, 5\} \{0, 2, 3, 5, 6, 7, 8, 10, 13, 14, 16, 17, 18, 20, 22\} \\ r\_1(\mathbf{x}) &\leftarrow \{0, 3, 5, 6, 8\} \{0, 1, 3, 4, 5, 6, 8\} \{0, 2, 4, 7, 11\} \end{aligned}$$

It can be seen that the two polynomials have no common factor, so their GCD is 1. Correspondingly, the convolutional code is not a catastrophic code.
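The relative primality check is easily carried out with Euclid's algorithm over GF(2). A minimal sketch, representing polynomials as Python integer bitmasks (an implementation choice, not from the text):

```python
# GF(2) polynomial GCD: bit i of the integer is the coefficient of x^i.
def gf2_gcd(a, b):
    while b:
        while a and a.bit_length() >= b.bit_length():
            a ^= b << (a.bit_length() - b.bit_length())
        a, b = b, a
    return a

def mask(exponents):
    m = 0
    for e in exponents:
        m |= 1 << e
    return m

# The tail-biting code generators from the text.
r0 = mask({0, 2, 5, 8, 9, 10, 12, 13, 14, 15, 27})
r1 = mask({0, 1, 2, 3, 4, 5, 7, 8, 11, 12, 16, 18, 20, 23, 27})
print(gf2_gcd(r0, r1))  # 1: relatively prime, so the code is not catastrophic
```

The same routine can be dropped into the generator polynomial search loop to reject catastrophic candidate pairs before evaluating *dfree*.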

As well as constructing double-circulant codes from convolutional codes, double-circulant codes may be used to construct good convolutional codes. The idea of generating convolutional codes from good block codes is not that new. Massey et al. in 1973 generated a convolutional code for space communications from an (89, 44, 18) quadratic residue cyclic code [5, 6]. As described in Chap. 9, prime numbers which are congruent to ±3 modulo 8 may be used to generate double-circulant codes, using the quadratic residues to construct one circulant, the other circulant being the identity circulant; the length of each circulant is equal to the prime number.

Particularly good double-circulant codes are obtained in this way, as discussed in Chap. 9. For example, the prime number 67 can be used to generate a (134, 67, 23) double-circulant code with the circulants defined by the two polynomials with the following exponents of the non-zero coefficients:

$$\begin{aligned} r\_0(x) &\leftarrow \{0\} \\ r\_1(x) &\leftarrow \{0, 1, 4, 6, 9, 10, 14, 15, 16, 17, 19, 21, 22, 23, 24, 25, 26, 29, 33, 35, 36, \\ &\qquad\; 37, 39, 40, 47, 49, 54, 55, 56, 59, 60, 62, 64, 65\} \end{aligned}$$
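The exponent set defining the second circulant is just the quadratic residues modulo 67 together with 0, which is easily checked:

```python
# Quadratic residues modulo the prime p = 67, plus the exponent 0.
p = 67
residues = sorted({(s * s) % p for s in range(1, p)})
exponents = sorted({0} | set(residues))

print(len(residues))   # 33 distinct quadratic residues, (p - 1) / 2 of them
print(exponents)       # matches the exponent set of r1(x) above
```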

Using these two polynomials as the generator polynomials for a rate 1/2 convolutional code, a systematic convolutional code having a *dfree* of 30 is obtained. Interestingly, deriving another double-circulant code from the tail-biting version of this convolutional code only produces good results when the circulant length is exactly 67, thereby reproducing the original code. For longer circulants, the *dmin* is degraded unless the circulants are much longer; it is found that the circulants have to be as long as 110 to produce a (220, 110, 30) double-circulant code having a *dmin* equal to the *dfree* of the original convolutional code. Moreover, this is a good code because it has the same parameters as the corresponding best-known code [4].

A double-circulant code may also be used to derive a non-systematic convolutional code with much smaller memory and a *dfree* equal to the *dmin* of the double-circulant code, by selecting a codeword of the double-circulant code which features low-degree polynomials in each circulant. It is necessary to check that these polynomials are relatively prime, otherwise a catastrophic convolutional code is produced; in this event a new codeword is selected. The code produced is a non-systematic convolutional code with memory equal to the higher degree of the two circulant polynomials. For example, a memory 41 non-systematic convolutional code can be derived from the memory 65, systematic convolutional code based on the (134, 67, 23) double-circulant code with the following exponents of the non-zero coefficients:

$$\begin{aligned} r\_0(x) &\leftarrow \{0\} \\ r\_1(x) &\leftarrow \{0, 1, 4, 6, 9, 10, 14, 15, 16, 17, 19, 21, 22, 23, 24, 25, 26, 29, 33, 35, 36, \\ &\qquad\; 37, 39, 40, 47, 49, 54, 55, 56, 59, 60, 62, 64, 65\} \end{aligned}$$

Codeword analysis of the double-circulant code is carried out to find the low memory generator polynomials. The following two generator polynomials were obtained from the two circulant polynomials making up a weight 23 codeword of the (134, 67, 23) code:

$$\begin{aligned} r\_0(\mathbf{x}) &\leftarrow \{0, 1, 2, 4, 5, 10, 12, 32, 34, 36, 39, 41\} \\ r\_1(\mathbf{x}) &\leftarrow \{0, 2, 4, 13, 19, 24, 25, 26, 33, 35, 37\} \end{aligned}$$

In another example, the outstanding (200, 100, 32) extended cyclic quadratic residue code may be put in double-circulant form using the following exponents of the non-zero coefficients:

$$\begin{aligned} r\_0(x) &\leftarrow \{0\} \\ r\_1(x) &\leftarrow \{0, 1, 2, 5, 6, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 26, 27, 28, 31, 34, 35, 37, 38, 39, \\ &\qquad\; 42, 44, 45, 50, 51, 52, 53, 57, 58, 59, 64, 66, 67, 70, 73, 75, 76, 77, 80, 82, 85, 86, \\ &\qquad\; 89, 92, 93, 97, 98\} \end{aligned}$$

Enumeration of the codewords shows that there is a weight 32 codeword that defines the generator polynomials of a memory 78, non-systematic convolutional code. The codeword consists of two circulant polynomials, the higher degree of which is 78. The generator polynomials have the following exponents of the non-zero coefficients:

$$\begin{aligned} r\_0(x) &\leftarrow \{0, 2, 3, 8, 25, 27, 37, 44, 50, 52, 55, 57, 65, 66, 67, 69, 74, 75, 78\} \\ r\_1(x) &\leftarrow \{0, 8, 14, 38, 49, 51, 52, 53, 62, 69, 71, 72, 73\} \end{aligned}$$

The non-systematic convolutional code that is produced has a *dfree* of 32, equal to the *dmin* of the double-circulant code. Usually, it is hard to verify high values of *dfree* for convolutional codes, but in this particular case, as the convolutional code has been derived from the (200, 100, 32) extended quadratic residue, double-circulant code, which is self-dual and also fixed by the large projective special linear group *PSL*<sub>2</sub>(199), the *dmin* of this code has been proven to be 32, as described in Chap. 9. Thus, the non-systematic convolutional code that is produced has to have a *dfree* of 32.
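As a quick consistency check, the two circulant polynomials above form a weight 32 codeword whose higher degree is 78, matching the stated *dfree* and the memory of the derived convolutional code:

```python
# Exponent sets of the two circulant polynomials from the text.
r0 = {0, 2, 3, 8, 25, 27, 37, 44, 50, 52, 55, 57, 65, 66, 67, 69, 74, 75, 78}
r1 = {0, 8, 14, 38, 49, 51, 52, 53, 62, 69, 71, 72, 73}

print(len(r0) + len(r1))       # 32: the codeword weight, equal to d_free
print(max(max(r0), max(r1)))   # 78: the memory of the derived convolutional code
```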

#### **10.3 Summary**

Convolutional codes have been explored from a historical and modern perspective. Their performance, as traditionally used, has been compared to the performance realised using maximum likelihood decoding featuring an extended Dorsch decoder, with the convolutional codes implemented as tail-biting block codes. It has been shown that the convolutional codes designed for space applications and sequential decoding over 40 years ago were very good codes, comparable to the best codes known today. The performance realised back then was limited by the sequential decoder, as shown by the presented results. An additional 2 dB of coding gain could have been realised using the modern, extended Dorsch decoder instead of the sequential decoder. However, back then, this decoder had yet to be discovered and was probably too expensive for the technology available at the time.

It has also been shown that convolutional codes may be used as the basis for designing double-circulant block codes and vice versa. In particular, high, guaranteed values of *dfree* may be obtained by basing convolutional codes on outstanding double-circulant codes. A memory 78, non-systematic, half rate convolutional code with a *dfree* of 32 was presented, based on the (200, 100, 32) extended quadratic residue, double-circulant code.

## **References**



# **Chapter 11 Analogue BCH Codes and Direct Reduced Echelon Parity Check Matrix Construction**

#### **11.1 Introduction**

Analogue error-correcting codes having real and complex number coefficients were first discussed by Marshall [2]. Later, Jack Wolf [3] introduced Discrete Fourier Transform (DFT) codes having complex number coefficients and showed that an (*n*, *k*) DFT code can often correct up to *n* − *k* − 1 errors using a majority voting type of decoder. In this chapter, the codes are first defined and it is shown that (*n*, *k*) DFT codes have coordinate coefficients with complex values. These codes have a minimum Hamming distance of *n* − *k* + 1 and are Maximum Distance Separable (MDS) codes. The link between the Discrete Fourier Transform and the Mattson–Solomon polynomial is discussed, and it is shown that the parity check algorithm used to generate DFT codes can be applied to all BCH codes, including Reed–Solomon codes, simply by switching from complex number arithmetic to Galois field arithmetic. It is shown that it is straightforward to mix quantised and non-quantised codeword coefficients, which can be useful in certain applications. Several worked examples are described, including analogue error-correction encoding and decoding applied to stereo audio waveforms (music).

In common with standard BCH or Reed–Solomon (RS) codes, it is shown that parity check symbols may be calculated for any *n* − *k* arbitrary positions in each codeword and an efficient method is described for doing this. A proof of the validity of the method is given.

#### **11.2 Analogue BCH Codes and DFT Codes**

In a similar manner to conventional BCH codes, a codeword of an analogue (*n*, *k*) BCH code is defined as

$$c(\mathbf{x}) = c\_0 + c\_1 \mathbf{x} + c\_2 \mathbf{x}^2 + c\_3 \mathbf{x}^3 + c\_4 \mathbf{x}^4 + c\_5 \mathbf{x}^5 + \dots + c\_{n-1} \mathbf{x}^{n-1}$$


where

$$c(x) = g(x)d(x)$$

*g*(*x*) is the generator polynomial of the code with degree *n* − *k* and *d*(*x*) is any data polynomial of degree less than *k*. Correspondingly,

$$g(x) = g\_0 + g\_1 x + g\_2 x^2 + \dots + g\_{n-k} x^{n-k}$$

and

$$d(\mathbf{x}) = d\_0 + d\_1 \mathbf{x} + d\_2 \mathbf{x}^2 + \dots + d\_{k-1} \mathbf{x}^{k-1}$$

The coefficients of *c*(*x*) are complex numbers from the field of complex numbers. A parity check polynomial *h*(*x*) is defined, where

$$h(\mathbf{x}) = h\_0 + h\_1 \mathbf{x} + h\_2 \mathbf{x}^2 + h\_3 \mathbf{x}^3 + h\_4 \mathbf{x}^4 + h\_5 \mathbf{x}^5 + \dots + h\_{n-1} \mathbf{x}^{n-1}$$

where

$$h(\mathbf{x}) \mathbf{g}(\mathbf{x}) \bmod (\mathbf{x}^n - 1) = 0$$

and accordingly,

$$h(\mathbf{x})c(\mathbf{x}) \bmod (\mathbf{x}^n - 1) = 0$$

The generator polynomial and the parity check polynomial may be defined in terms of the Discrete Fourier Transform or equivalently by the Mattson–Solomon polynomial.

**Definition 11.1** (*Definition of Mattson–Solomon polynomial*) The Mattson– Solomon polynomial of any polynomial *a*(*x*) is the linear transformation of *a*(*x*) to *A*(*z*) and is defined by [1],

$$A(z) = \text{MS}(a(\mathbf{x})) = \sum\_{i=0}^{n-1} a(\alpha^{-i}) \, z^i \tag{11.1}$$

The inverse Mattson–Solomon polynomial or inverse Fourier transform is:

$$a(x) = \mathrm{MS}^{-1}(A(z)) = \frac{1}{n} \sum\_{i=0}^{n-1} A(\alpha^i) \, x^i \tag{11.2}$$

α is a primitive root of unity with order *n* and for analogue BCH codes

$$\alpha = e^{j\frac{2\pi}{n}} \tag{11.3}$$

where $j = \sqrt{-1}$. In terms of a narrow sense, primitive BCH code with a generator polynomial of *g*(*x*), the coefficients of *G*(*z*) are all zero from *z*<sup>0</sup> through *z*<sup>*n*−*k*−1</sup> and the coefficients of *H*(*z*) are all zero from *z*<sup>*n*−*k*</sup> through *z*<sup>*n*−1</sup>. Consequently, the coefficient by coefficient product of *G*(*z*) and *H*(*z*), denoted by $\odot$, is zero:

$$G(z) \odot H(z) = \sum\_{j=0}^{n-1} G\_j H\_j \, z^j = 0 \tag{11.4}$$
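For the complex number field, the transform pair (11.1) and (11.2) is simply a DFT pair, and the two transforms invert each other exactly. A short numerical sketch (the sample values are arbitrary illustrations):

```python
import cmath

def ms(a):
    # A_i = a(alpha^-i) with alpha = exp(j 2 pi / n), as in Eq. (11.1).
    n = len(a)
    alpha = cmath.exp(2j * cmath.pi / n)
    return [sum(c * alpha ** (-i * k) for k, c in enumerate(a)) for i in range(n)]

def inverse_ms(A):
    # a_i = (1/n) A(alpha^i), as in Eq. (11.2).
    n = len(A)
    alpha = cmath.exp(2j * cmath.pi / n)
    return [sum(C * alpha ** (i * k) for k, C in enumerate(A)) / n for i in range(n)]

a = [1.0, 2.0, -0.5, 0.0, 3.0, 1.5, 0.0, -2.0]
b = inverse_ms(ms(a))
print(max(abs(x - y) for x, y in zip(a, b)) < 1e-9)  # True: round-trip recovers a(x)
```

Switching the two functions from complex arithmetic to Galois field arithmetic gives the corresponding transforms for standard BCH and RS codes, as noted in the introduction.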

The nonzero terms of *H*(*z*) extend from *z*<sup>0</sup> through to *z*<sup>*n*−*k*−1</sup>, and a valid parity check matrix in the well-known form is:

$$\mathbf{H} = \begin{bmatrix} 1 & 1 & 1 & \dots & 1 \\ 1 & \alpha^1 & \alpha^2 & \dots & \alpha^{n-1} \\ 1 & \alpha^2 & \alpha^4 & \dots & \alpha^{2(n-1)} \\ 1 & \alpha^3 & \alpha^6 & \dots & \alpha^{3(n-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \alpha^{n-k-1} & \alpha^{2(n-k-1)} & \dots & \alpha^{(n-k-1)(n-1)} \end{bmatrix}$$

It will be noticed that each row of this matrix is simply given by the inverse Mattson– Solomon polynomial of *H*(*z*), where

$$\begin{aligned} H(z) &= 1 \\ H(z) &= z \\ H(z) &= z^2 \\ &\;\,\vdots \\ H(z) &= z^{n-k-1} \end{aligned} \tag{11.5}$$

Consider *H*(*z*) = *z*−α*<sup>i</sup>* , the inverse Mattson–Solomon polynomial produces a parity check equation defined by

$$\begin{bmatrix} 1 - \alpha^i & \alpha^1 - \alpha^i & \alpha^2 - \alpha^i & \dots & 0 & \dots & \alpha^{n-1} - \alpha^i \end{bmatrix}$$

Notice that this parity check equation may be derived from linear combinations of the first two rows of **H** by multiplying the first row by α*<sup>i</sup>* before subtracting it from the second row of **H**. The resulting row may be conveniently represented by

$$\begin{bmatrix} \alpha^{a\_0} & \alpha^{a\_1} & \alpha^{a\_2} & \alpha^{a\_3} & \dots & 0 & \dots & \alpha^{a\_{n-2}} & \alpha^{a\_{n-1}} \end{bmatrix}$$

It will be noticed that the *ith* coordinate of the codeword is multiplied by zero, and hence the parity symbol obtained by this parity check equation is independent of the value of the *ith* coordinate. Each one of the other coordinates is multiplied by a nonzero value. Hence any one of these *n* − 1 coordinates may be solved using this parity check equation in terms of the other *n* − 2 coordinates involved in the equation.

Similarly, considering *H*(*z*) = *z* −α *<sup>j</sup>* , the inverse Mattson–Solomon polynomial produces a parity check equation defined by

$$\begin{bmatrix} 1 - \alpha^j & \alpha^1 - \alpha^j & \alpha^2 - \alpha^j & \dots & 0 & \dots & \alpha^{n-1} - \alpha^j \end{bmatrix}$$

and this may be conveniently represented by

$$\begin{bmatrix} \alpha^{b\_0} & \alpha^{b\_1} & \alpha^{b\_2} & \alpha^{b\_3} & \dots & 0 & \dots & \alpha^{b\_{n-2}} & \alpha^{b\_{n-1}} \end{bmatrix}$$

Now the *jth* coordinate is multiplied by zero and hence the parity symbol obtained by this parity check equation is independent of the value of the *jth* coordinate.

Developing the argument, if we consider *H*(*z*) = (*z* − α*<sup>i</sup>* )(*z* − α *<sup>j</sup>* ), the inverse Mattson–Solomon polynomial produces a parity check equation defined by

$$\begin{bmatrix} \alpha^{a\_0}\alpha^{b\_0} & \alpha^{a\_1}\alpha^{b\_1} & \dots & 0 & \dots & 0 & \dots & \alpha^{a\_{n-1}}\alpha^{b\_{n-1}} \end{bmatrix}$$

This parity check equation has zeros in the *ith* and *jth* coordinate positions and as each one of the other coordinates is multiplied by a nonzero value, any one of these *n* − 2 coordinates may be solved using this parity check equation in terms of the other *n* − 3 coordinates involved in the equation.

Proceeding in this way, for *H*(*z*) = (*z*−α*<sup>i</sup>* )(*z*−α *<sup>j</sup>* )(*z*−α*<sup>k</sup>* ), the inverse Mattson– Solomon polynomial produces a parity check equation which is independent of the *ith*, *jth* and *kth* coordinates and these coordinate positions may be arbitrarily chosen. The parity check matrix is

$$\mathbf{H}\_{\mathbf{m}} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & \dots & 1 \\ \alpha^{u\_0} & \alpha^{u\_1} & \alpha^{u\_2} & \alpha^{u\_3} & \alpha^{u\_4} & 0 & \dots & \alpha^{u\_{n-1}} \\ \alpha^{v\_0} & 0 & \alpha^{v\_2} & \alpha^{v\_3} & \alpha^{v\_4} & 0 & \dots & \alpha^{v\_{n-1}} \\ \alpha^{w\_0} & 0 & \alpha^{w\_2} & 0 & \alpha^{w\_4} & 0 & \dots & \alpha^{w\_{n-1}} \end{bmatrix}$$

The point here is that this parity check matrix **Hm** has been obtained from linear combinations of the original parity check matrix **H** and all parity check equations from either **H** or **Hm** are satisfied by codewords of the code.

The parity check matrix **Hm** may be used to solve for 4 parity check symbols in 4 arbitrary coordinate positions, defined by the *ith*, *jth* and *kth* coordinate positions plus any one of the other coordinate positions, which will be denoted as the *lth* position. The coordinate value in the *lth* position is solved first using the last equation. Parity symbols in the *ith*, *jth* and *kth* positions are unknown, but this does not matter as these are multiplied by zero. The third parity check equation is used next to solve for the parity symbol in the *kth* position. Then, the second parity check equation is used to solve for the parity symbol in the *jth* position and, lastly, the first parity check equation is used to solve for the parity symbol in the *ith* position. The parity check matrix values, for *s* = 0 through to *n* − 1, are given by:

$$\begin{cases} \alpha^{u\_s} = \alpha^s - \alpha^i \\ \alpha^{v\_s} = (\alpha^s - \alpha^i)(\alpha^s - \alpha^j) \\ \alpha^{w\_s} = (\alpha^s - \alpha^i)(\alpha^s - \alpha^j)(\alpha^s - \alpha^k) = \alpha^{v\_s}(\alpha^s - \alpha^k) \end{cases}$$

Codewords of the code may be produced by first deciding on the number of parity check symbols and their positions and then constructing the corresponding parity check matrix **Hm**. From the information symbols, the parity check symbols are calculated by using each row of **Hm** starting with the last row as described above.
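This encoding procedure can be sketched numerically for a small complex DFT code. The length, parity positions and information values below are illustrative assumptions: each successive parity check row is the inverse Mattson–Solomon polynomial of $\prod_p (z - \alpha^{p})$ over the already chosen parity positions $p$, and the parity symbols are solved starting with the last row:

```python
import cmath

n = 8
parity_pos = [1, 4, 6]   # three parity symbols in arbitrary positions (assumed example)
alpha = cmath.exp(2j * cmath.pi / n)

def check_row(zero_positions):
    # Inverse MS of H(z) = prod (z - alpha^p): coordinate s carries prod (alpha^s - alpha^p).
    row = []
    for s in range(n):
        v = 1 + 0j
        for p in zero_positions:
            v *= alpha ** s - alpha ** p
        row.append(v)
    return row

rows = [check_row(parity_pos[:m]) for m in range(len(parity_pos))]

c = [0j] * n                             # place information symbols off the parity positions
info = iter([1.0, -2.0, 0.5, 3.0, 1.5])  # arbitrary information values
for t in range(n):
    if t not in parity_pos:
        c[t] = complex(next(info))

for m in reversed(range(len(parity_pos))):   # last row first, as described in the text
    p = parity_pos[m]
    s = sum(rows[m][t] * c[t] for t in range(n) if t != p)
    c[p] = -s / rows[m][p]

# The result satisfies the original Vandermonde checks: sum_t c_t alpha^(r t) = 0.
syndrome = max(abs(sum(c[t] * alpha ** (r * t) for t in range(n)))
               for r in range(len(parity_pos)))
print(syndrome < 1e-9)  # True
```

Because each row of the reduced echelon matrix lies in the row space of the original parity check matrix, solving with the rows in reverse order automatically satisfies all the original checks; the same loop serves as an erasures decoder when the parity positions are replaced by the erasure positions.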

In the above, there are 4 parity check rows and hence 4 parity check symbols, which can be in any positions of the code. Clearly, the method can be extended to any number of parity check symbols. Any length of code may be produced by simply assuming that some coordinates are always zero and eliminating these columns from the parity check matrix. The columns of the parity check matrix may also be permuted to any order, but the resulting code will not be cyclic.

It follows that with the *n* −*k* parity check equations constructed using the method above, codeword coordinates may be solved in any of *n* − *k* arbitrary positions. In the construction of each parity check equation there is exactly one additional zero compared to the previously constructed parity check equation. Hence there are *n* −*k* independent parity check equations in any of *n* − *k* arbitrary positions.

Since these equations are all from the same code the minimum Hamming distance of the code is *n* − *k* + 1 and the code is MDS. A system for the calculation of parity check symbols in arbitrary positions may be used for encoding or for the correction of erasures. A block diagram of such an encoder/erasures decoder is shown in Fig. 11.1.

**Fig. 11.1** The efficient encoder/erasures decoder realisation for BCH codes

When operating as an erasures decoder in Fig. 11.1, the List of Parity Symbols is replaced with a list of the erasures positions.

#### **11.3 Error-Correction of Bandlimited Data**

In many cases, the sampled data to be encoded with the analogue BCH code is already bandlimited, or nearly so, in which case the higher frequency coefficients of the Mattson–Solomon polynomial *D*(*z*) of the data polynomial *d*(*x*), consisting of successive PAM samples, will be zero or near zero. An important point here is that there is no need to add additional redundancy with additional parity check samples. In a sense the data, as PAM samples, already contains the parity check samples. Commonly, it is only necessary to modify a small number of samples to turn the sampled data into codewords of the analogue BCH code, as illustrated in the example below. The broad sense BCH codes are used with the following parity check matrix, with $\alpha = e^{-j\frac{2\pi}{n}}$.

$$\mathbf{H}\_{\mathbf{f}} = \begin{bmatrix} 1 & \alpha^{\beta} & \alpha^{2\beta} & \dots & \alpha^{(n-1)\beta} \\ 1 & \alpha^{\beta+1} & \alpha^{2(\beta+1)} & \dots & \alpha^{(n-1)(\beta+1)} \\ 1 & \alpha^{\beta+2} & \alpha^{2(\beta+2)} & \dots & \alpha^{(n-1)(\beta+2)} \\ 1 & \alpha^{\beta+3} & \alpha^{2(\beta+3)} & \dots & \alpha^{(n-1)(\beta+3)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \alpha^{\beta+n-k-1} & \alpha^{2(\beta+n-k-1)} & \dots & \alpha^{(n-1)(\beta+n-k-1)} \end{bmatrix} \tag{11.6}$$

Using this parity check matrix will ensure that the highest *n* − *k* Fourier coefficients will be zero. Several alternative procedures may be used. *n* − *k* samples in each sequence of *n* samples may be designated as parity symbols and solved using this parity check matrix, following the procedure above for constructing the reduced echelon matrix, so that the values of the designated parity samples may be calculated. An alternative, more complicated, procedure is, for each constructed codeword, to allow the *n* − *k* parity samples to be in any of the $\frac{n!}{k!\,(n-k)!}$ combinations of positions and to choose the combination which produces the minimum mean squared difference compared to the original *n* − *k* complex samples.

# **11.4 Analogue BCH Codes Based on Arbitrary Field Elements**

It is not necessary that the parity check matrix be based on increasing powers of α with parity check equations corresponding to the forcing of Fourier coefficients to be zero. An arbitrary ordering of complex field elements corresponding to permuted powers of α may be used. With α = *e*<sup>−*j*2π/*N*</sup>, where *N* ≥ *n*, consider the parity check matrix

$$\mathbf{H}\_{\mathbf{a}} = \begin{bmatrix} 1 & 1 & 1 & 1 & \dots & 1 \\ \alpha\_0 & \alpha\_1 & \alpha\_2 & \alpha\_3 & \dots & \alpha\_{n-1} \\ \alpha\_0^2 & \alpha\_1^2 & \alpha\_2^2 & \alpha\_3^2 & \dots & \alpha\_{n-1}^2 \\ \alpha\_0^3 & \alpha\_1^3 & \alpha\_2^3 & \alpha\_3^3 & \dots & \alpha\_{n-1}^3 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ \alpha\_0^{n-k-1} & \alpha\_1^{n-k-1} & \alpha\_2^{n-k-1} & \alpha\_3^{n-k-1} & \dots & \alpha\_{n-1}^{n-k-1} \end{bmatrix}$$

The complex number field elements {α<sub>0</sub>, α<sub>1</sub>, α<sub>2</sub>, α<sub>3</sub>, ..., α<sub>*n*−1</sub>} are all distinct, arbitrary powers of α. Any combination of *n* − *k* or fewer columns of this parity check matrix is linearly independent because the matrix transpose is a Vandermonde matrix [1]. Consequently, the code is an (*n*, *k*, *n* − *k* + 1) MDS code.
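The MDS property is easy to check numerically for small parameters. The sketch below is illustrative only (the values of *n*, *n* − *k* and *N* are assumptions): it builds **Ha** from randomly permuted, distinct powers of α and verifies that every selection of *n* − *k* columns has full rank.

```python
import numpy as np
from itertools import combinations

n, n_k = 8, 3                      # an (8, 5, 4) code if the MDS property holds
N = 11                             # alpha = exp(-j*2*pi/N), with N >= n
rng = np.random.default_rng(0)

# n distinct, arbitrarily ordered powers of alpha
powers = rng.permutation(N)[:n]
alphas = np.exp(-2j * np.pi * powers / N)

# H_a: row j holds alphas**j for j = 0 .. n-k-1
H = np.vstack([alphas ** j for j in range(n_k)])

# every choice of n - k columns is a transposed Vandermonde matrix: full rank
for cols in combinations(range(n), n_k):
    assert np.linalg.matrix_rank(H[:, cols]) == n_k
```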

Following the same procedure as outlined above, a reduced echelon parity check matrix **Hb** may be produced directly with zeros in arbitrary columns, for example in three columns headed by α*a*, α*b* and α*c*:

$$\mathbf{H}\_{\mathbf{b}} = \begin{bmatrix} 1 & 1 & 1 & 1 & \dots & 1 \\ (\alpha\_0 - \alpha\_a) & 0 & (\alpha\_2 - \alpha\_a) & (\alpha\_3 - \alpha\_a) & \dots & (\alpha\_{n-1} - \alpha\_a) \\ (\alpha\_0 - \alpha\_a)(\alpha\_0 - \alpha\_b) & 0 & 0 & (\alpha\_3 - \alpha\_a)(\alpha\_3 - \alpha\_b) & \dots & (\alpha\_{n-1} - \alpha\_a)(\alpha\_{n-1} - \alpha\_b) \\ (\alpha\_0 - \alpha\_a)(\alpha\_0 - \alpha\_b)(\alpha\_0 - \alpha\_c) & 0 & 0 & (\alpha\_3 - \alpha\_a)(\alpha\_3 - \alpha\_b)(\alpha\_3 - \alpha\_c) & \dots & 0 \end{bmatrix}$$

The parity check equation corresponding to the fourth row of this parity check matrix is

$$\sum\_{i=0}^{n-1} (\alpha\_i - \alpha\_a)(\alpha\_i - \alpha\_b)(\alpha\_i - \alpha\_c)c\_i = 0 \tag{11.7}$$

where the analogue BCH codeword consists of *n* complex numbers

{*c*<sub>0</sub>, *c*<sub>1</sub>, *c*<sub>2</sub>, *c*<sub>3</sub>, ..., *c*<sub>*n*−1</sub>}

*k* of these complex numbers may be arbitrary, determined by the information source, and the remaining *n* − *k* complex numbers are calculated from the parity check equations.

Defining

$$(\alpha\_i - \alpha\_a)(\alpha\_i - \alpha\_b)(\alpha\_i - \alpha\_c) = \alpha\_i^3 + \beta\_2 \alpha\_i^2 + \beta\_1 \alpha\_i + \beta\_0$$

Parity check Eq. (11.7) becomes

$$\sum\_{i=0}^{n-1} \alpha\_i^3 c\_i + \beta\_2 \sum\_{i=0}^{n-1} \alpha\_i^2 c\_i + \beta\_1 \sum\_{i=0}^{n-1} \alpha\_i c\_i + \beta\_0 \sum\_{i=0}^{n-1} c\_i = 0 \tag{11.8}$$

This codeword is from the same code as defined by the parity check matrix **Ha** because using parity check matrix **Ha**, codewords satisfy the equations

$$\sum\_{i=0}^{n-1} \alpha\_i^3 c\_i = 0 \quad \sum\_{i=0}^{n-1} \alpha\_i^2 c\_i = 0 \quad \sum\_{i=0}^{n-1} \alpha\_i c\_i = 0 \quad \sum\_{i=0}^{n-1} c\_i = 0$$

and consequently the codewords defined by **Ha** satisfy (11.8) as

$$0 + \beta\_2 \cdot 0 + \beta\_1 \cdot 0 + \beta\_0 \cdot 0 = 0$$

It is apparent that the reduced echelon matrix **Hb** consists of linear combinations of parity check matrix **Ha** and either matrix may be used to produce the same MDS, analogue BCH code.

#### **11.5 Examples**

#### *11.5.1 Example of Simple (***5***,* **3***,* **3***) Analogue Code*

This simple code is the extended analogue BCH code having complex sample values, with α = *e*<sup>*j*2π/4</sup> = *j*, and uses the parity check matrix:

$$\mathbf{H} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & j & -1 & -j \end{bmatrix}$$

This parity check matrix is used to encode 3 complex data values in the last 3 positions, viz (0.11 + 0.98 *j*, −0.22 − 0.88 *j*, 0.33 + 0.78 *j*). This produces the codeword:

$$(-0.2 - 0.22j, \ -0.02 - 0.66j, \ 0.11 + 0.98j, \ -0.22 - 0.88j, \ 0.33 + 0.78j)$$

Suppose the received vector has the last digit in error

$$(-0.2 - 0.22j, \quad -0.02 - 0.66j, \quad 0.11 + 0.98j, \quad -0.22 - 0.88j, \quad 0.4 + 0.9j)$$

Applying the first parity check equation produces 0.07 + 0.12 *j*. This result tells us that there is an error of 0.07+0.12 *j* in one of the received coordinates. Applying the second parity check equation produces 0.12−0.07 *j*. Since this is the error multiplied by − *j*, this tells us that the error is in the last coordinate. Subtracting the error from the last coordinate of the received vector produces (0.4 + 0.9 *j*) − (0.07 + 0.12 *j*) = 0.33 + 0.78 *j* and the error has been corrected.
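The arithmetic of this worked example can be reproduced directly in plain Python (the variable names below are ours, chosen for the sketch):

```python
# encode the 3 data values of the (5, 3, 3) example; alpha = exp(j*2*pi/4) = j
data = [0.11 + 0.98j, -0.22 - 0.88j, 0.33 + 0.78j]
h2 = [0, 1, 1j, -1, -1j]          # second row of the parity check matrix

# the two parity samples occupy the first two positions:
#   p0 + p1 + sum(data) = 0  and  p1 + j*c2 - c3 - j*c4 = 0
p1 = -(1j * data[0] - data[1] - 1j * data[2])
p0 = -(p1 + sum(data))
c = [p0, p1] + data               # the codeword given in the text

# corrupt the last coordinate as in the example
r = list(c)
r[4] = 0.4 + 0.9j

s1 = sum(r)                                 # first syndrome: the error value
s2 = sum(h * x for h, x in zip(h2, r))      # second syndrome: error times -j
assert abs(s2 - (-1j) * s1) < 1e-12         # locates the error in position 4
r[4] -= s1                                  # subtract the error
assert abs(r[4] - (0.33 + 0.78j)) < 1e-12   # last coordinate corrected
```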

# *11.5.2 Example of Erasures Correction Using (***15***,* **10***,* **4***) Binary BCH code*

This is an example demonstrating that the erasures decoder shown in Fig. 11.1 may be used to correct erasures in a binary BCH code as well as being able to correct erasures using an analogue BCH code.

The code is a binary BCH code of length *n* = 15, with binary codewords generated by the generator polynomial *g*(*x*) = (1 + *x*<sup>3</sup> + *x*<sup>4</sup>)(1 + *x*). The Galois field is GF(2<sup>4</sup>) generated by the primitive root α, which is a root of the primitive polynomial 1 + *x* + *x*<sup>4</sup>, so that 1 + α + α<sup>4</sup> = 0, and the Galois field consists of the 15 field elements listed in Table 11.1, plus the element 0.

One example of a codeword from the code is

$$c(x) = x + x^3 + x^4 + x^6 + x^8 + x^9 + x^{10} + x^{11} \tag{11.9}$$

and consider that in a communication system the codeword is received with erasures in positions λ<sub>0</sub> = 5, λ<sub>1</sub> = 0 and λ<sub>2</sub> = 8, so that the received codeword is

$$\hat{c}(x) = \hat{c}\_0 + x + x^3 + x^4 + \hat{c}\_5 x^5 + x^6 + \hat{c}\_8 x^8 + x^9 + x^{10} + x^{11} \tag{11.10}$$

To find the parity check equations to solve for the erasures, referring to Fig. 11.1, the first parity check equation, *h*<sub>0</sub>(*x*), the all 1's vector, is stored in the Register. The second parity check equation *h*<sub>1</sub>(*x*) has zero for the coefficient of *x*<sup>*n*−λ<sub>0</sub></sup> = *x*<sup>10</sup> and is given by

$$h\_1(x) = \sum\_{j=0}^{n-1} (\alpha^j - \alpha^{10}) x^j \tag{11.11}$$

Note that *h*<sub>*i*</sub>(*x*)*c*ˆ(*x*) = 0, and these polynomials are derived with the intention that the coefficient of *x*<sup>0</sup> will be evaluated. Referring to Fig. 11.1, *h*<sub>1</sub>(*x*) is stored in the corresponding Register. After substitution using Table 11.1, it is found that

$$\begin{aligned} h\_1(x) &= \alpha^5 + \alpha^8 x + \alpha^4 x^2 + \alpha^{12} x^3 + \alpha^2 x^4 + x^5 \\ &+ \alpha^7 x^6 + \alpha^6 x^7 + \alpha x^8 + \alpha^{13} x^9 + \alpha^{14} x^{11} \\ &+ \alpha^3 x^{12} + \alpha^9 x^{13} + \alpha^{11} x^{14} \end{aligned} \tag{11.12}$$

Notice that although the codeword is binary, the coefficients of this equation are from the full extension field GF(16). The third parity check equation *h*<sub>2</sub>(*x*) has zero for the coefficient of *x*<sup>(*n*−λ<sub>1</sub>) mod *n*</sup> = *x*<sup>0</sup> and is given by

$$h\_2(x) = \sum\_{j=0}^{n-1} (\alpha^j - 1) x^j \tag{11.13}$$

after evaluation

$$\begin{aligned} h\_2(x) &= \alpha^4 x + \alpha^8 x^2 + \alpha^{14} x^3 + \alpha x^4 + \alpha^{10} x^5 \\ &+ \alpha^{13} x^6 + \alpha^9 x^7 + \alpha^2 x^8 + \alpha^7 x^9 + \alpha^5 x^{10} \\ &+ \alpha^{12} x^{11} + \alpha^{11} x^{12} + \alpha^6 x^{13} + \alpha^3 x^{14} \end{aligned} \tag{11.14}$$

Referring to Fig. 11.1, this polynomial is stored in the corresponding Register.

The parity check equation which gives the solution for coefficient *c*ˆ<sub>8</sub> is *h*<sub>3</sub>(*x*) = *h*<sub>0</sub>(*x*) *h*<sub>1</sub>(*x*) *h*<sub>2</sub>(*x*), where the product is taken coefficient by coefficient. Multiplying together each of the corresponding coefficients of the polynomials *h*<sub>0</sub>(*x*), *h*<sub>1</sub>(*x*) and *h*<sub>2</sub>(*x*) produces

$$\begin{aligned} h\_3(x) &= \alpha^{12} x + \alpha^{12} x^2 + \alpha^{11} x^3 + \alpha^3 x^4 + \alpha^{10} x^5 \\ &+ \alpha^5 x^6 + x^7 + \alpha^3 x^8 + \alpha^5 x^9 + \alpha^{11} x^{11} \\ &+ \alpha^{14} x^{12} + x^{13} + \alpha^{14} x^{14} \end{aligned} \tag{11.15}$$

Referring to Fig. 11.1, *h*3(*x*) will be input to Multiply and Sum. It should be noted that the parity check equation *h*3(*x*) has non-binary coefficients, even though the codeword is binary and the solution to the parity check equation has to be binary.

Evaluating the coefficient of *x*<sup>0</sup> of *h*<sub>3</sub>(*x*)*c*ˆ(*x*) gives α<sup>14</sup> + α<sup>14</sup> + α<sup>11</sup> + α<sup>5</sup> + *c*ˆ<sub>8</sub> + α<sup>5</sup> + α<sup>10</sup> + α<sup>3</sup> = 0, which simplifies to α<sup>11</sup> + *c*ˆ<sub>8</sub> + α<sup>10</sup> + α<sup>3</sup> = 0. Using Table 11.1 gives

$$(\alpha + \alpha^2 + \alpha^3) + \hat{c}\_8 + (1 + \alpha + \alpha^2) + \alpha^3 = 0$$

#### **Table 11.1** All 15 Nonzero Galois Field elements of GF(16)

| Element | Polynomial representation |
|---------|---------------------------|
| α<sup>0</sup> | 1 |
| α<sup>1</sup> | α |
| α<sup>2</sup> | α<sup>2</sup> |
| α<sup>3</sup> | α<sup>3</sup> |
| α<sup>4</sup> | 1 + α |
| α<sup>5</sup> | α + α<sup>2</sup> |
| α<sup>6</sup> | α<sup>2</sup> + α<sup>3</sup> |
| α<sup>7</sup> | 1 + α + α<sup>3</sup> |
| α<sup>8</sup> | 1 + α<sup>2</sup> |
| α<sup>9</sup> | α + α<sup>3</sup> |
| α<sup>10</sup> | 1 + α + α<sup>2</sup> |
| α<sup>11</sup> | α + α<sup>2</sup> + α<sup>3</sup> |
| α<sup>12</sup> | 1 + α + α<sup>2</sup> + α<sup>3</sup> |
| α<sup>13</sup> | 1 + α<sup>2</sup> + α<sup>3</sup> |
| α<sup>14</sup> | 1 + α<sup>3</sup> |

All entries follow from the defining relation 1 + α + α<sup>4</sup> = 0.
and *c*ˆ<sub>8</sub> = 1. Referring to Fig. 11.1, Select produces from *h*<sub>3</sub>(*x*) the value of the coefficient of *x*<sup>7</sup>, which is 1, and when inverted this is also equal to 1. The output of the Multiply and Sum is 1, producing a product of 1, which is used by Update to set *c*ˆ<sub>8</sub> = 1 in the Input Vector *c*ˆ(*x*).

The parity check equation *h*<sub>2</sub>(*x*) gives the solution for coefficient *c*ˆ<sub>5</sub>. Evaluating the coefficient of *x*<sup>0</sup> of *h*<sub>2</sub>(*x*)*c*ˆ(*x*) gives

$$0 = \alpha^3 + \alpha^{11} + \alpha^{12} + \hat{c}\_5 \alpha^5 + \alpha^7 + \alpha^9 + \alpha^{13} + \alpha^{10} + \alpha$$

Substituting using Table 11.1 gives *c*ˆ<sub>5</sub>α<sup>5</sup> = 0 and hence *c*ˆ<sub>5</sub> = 0.

Lastly, the parity check equation *h*<sub>0</sub>(*x*) gives the solution for coefficient *c*ˆ<sub>0</sub>. Evaluating the coefficient of *x*<sup>0</sup> of *h*<sub>0</sub>(*x*)*c*ˆ(*x*) gives

$$0 = \hat{c}\_0 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 \tag{11.16}$$

and it is found that *c*ˆ<sub>0</sub> = 0, and the updated *c*ˆ(*x*) with all three erasures solved is

$$\hat{c}(x) = x + x^3 + x^4 + x^6 + x^8 + x^9 + x^{10} + x^{11} \tag{11.17}$$

equal to the original codeword.
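The finite field arithmetic of this example can be reproduced mechanically. The following sketch generates GF(16) from the primitive polynomial 1 + *x* + *x*<sup>4</sup> and independently recomputes some of the coefficients used above (representing field elements as 4-bit integers is an implementation choice of this sketch).

```python
# build exp/log tables for GF(16): exp[i] is alpha^i as a 4-bit integer,
# with bit b representing alpha^b and reduction by alpha^4 = alpha + 1
exp = []
e = 1
for _ in range(15):
    exp.append(e)
    e <<= 1
    if e & 0b10000:
        e ^= 0b10011          # the primitive polynomial x^4 + x + 1
log = {v: i for i, v in enumerate(exp)}

# coefficients of h1(x): alpha^j + alpha^10 (addition in GF(2^4) is XOR)
h1 = [exp[j] ^ exp[10] for j in range(15)]
assert log[h1[0]] == 5        # constant term alpha^5
assert log[h1[2]] == 4        # alpha^4 x^2
assert h1[5] == 1             # coefficient 1 for x^5
assert h1[10] == 0            # the zero forced at position 10

# coefficients of h2(x): alpha^j + 1; position 0 is forced to zero
h2 = [exp[j] ^ 1 for j in range(15)]
assert h2[0] == 0
assert log[h2[10]] == 5       # alpha^5, the coefficient multiplying c5
```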

# *11.5.3 Example of (128, 112, 17) Analogue BCH Code and Error-Correction of Audio Data (Music) Subjected to Impulsive Noise*

In this example, a stereo music file sampled at 44.1 kHz in complex Pulse Amplitude Modulation (PAM) format is split into sequences of 128 complex samples, encoded using an analogue (128, 112, 17) BCH code with α = *e*<sup>*j*2π/128</sup>, and reassembled into a single PAM stream. A short section of the stereo left channel waveform before encoding is shown plotted in Fig. 11.2.

The encoding parity check matrix is the **Hf** matrix for bandlimited signals given above in matrix (11.6). There are 16 parity symbols and to make these obvious they are located at the beginning of each codeword. The same section of the stereo left channel waveform as before but after encoding is shown plotted in Fig. 11.3. The parity symbols are obvious as the newly introduced spikes in the waveform.

**Fig. 11.2** Section of music waveform prior to encoding

**Fig. 11.3** Section of music waveform after encoding

**Fig. 11.4** Section of music waveform after encoding and subjected to impulse noise

The parity symbols may be calculated for any combination of 16 coordinate positions and in a more complicated encoding arrangement the positions could be selected as those that produce the minimum mean square error. However, the frequency components affected extend from 19.47 to 22.1 kHz (these components are equal to zero after encoding) and are beyond the hearing range of most people.

The encoded music waveform is subjected to randomly distributed impulse noise with a uniformly distributed amplitude in the range ±16000. The result is shown plotted in Fig. 11.4 for the same section of the waveform as before, although this is not obvious in the plot.

The decoder strategy used is that in each received codeword the 16 received PAM samples with the greatest magnitudes exceeding a dynamic threshold, or with the largest change relative to neighbouring samples, are erased. The erasures are then solved using the parity check equations as outlined above. In several cases, correctly received PAM samples are erased, but this does not matter provided the 112 non-erased samples in each received codeword are correct. The decoded music waveform is shown in Fig. 11.5, and it is apparent that the waveform after decoding is the same as the encoded waveform and the impulse noise errors have been corrected.
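The overall strategy (encode by zeroing a set of Fourier coefficients, erase the most suspicious samples, re-solve from the parity checks) can be sketched with NumPy. This is an illustrative reconstruction and not the authors' implementation: the zeroed frequency set, the impulse model and the simple erase-the-largest rule are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(7)
n, n_k = 128, 16                     # the (128, 112, 17) analogue BCH parameters
F = np.arange(n - n_k, n)            # Fourier coefficients forced to zero
W = np.exp(-2j * np.pi * np.outer(F, np.arange(n)) / n)

# a codeword: random spectrum with the F coefficients set to zero
spec = rng.normal(size=n) + 1j * rng.normal(size=n)
spec[F] = 0
c = np.fft.ifft(spec)

# impulse noise at a few random positions, far above the signal level
r = c.copy()
hits = rng.choice(n, size=8, replace=False)
r[hits] += 100 * np.exp(2j * np.pi * rng.random(size=8))

# erase the n - k largest-magnitude samples and solve the parity checks
E = np.argsort(np.abs(r))[-n_k:]
keep = np.setdiff1d(np.arange(n), E)
r[E] = np.linalg.solve(W[:, E], -W[:, keep] @ r[keep])

assert np.allclose(r, c, atol=1e-6)  # all impulse errors corrected
```

Erasing some correctly received samples does no harm: as long as all corrupted positions are among the *n* − *k* erased ones, the square system above has the original codeword as its unique solution.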

Usually, impulse noise effects are handled by noise suppressors which produce short, zero-valued waveform sections. These audible gaps are irritating to the listener. By using analogue BCH error-correcting codes, there are no waveform gaps following decoding.

**Fig. 11.5** Section of music waveform after decoding

#### **11.6 Conclusions and Future Research**

It has been demonstrated that for analogue (*n*, *k*, *n* −*k* +1) BCH codes, parity check symbols having complex values may be calculated for any *n* − *k* arbitrary positions in each codeword and an efficient method of calculating erased symbols for any BCH code including binary codes has been presented. Bandlimited data naturally occurs in many sources of information. In effect the source data has already been encoded with an analogue BCH code. In practice the parity check equations of the BCH code will only approximately equal zero for the PAM samples of the bandlimited source. There is scope for determining those samples which require the minimum of changes in order to satisfy the parity check equations. Similarly in decoding codewords corrupted by a noisy channel there is the opportunity to use the statistics of the noise source to design a maximum likelihood decoder for analogue BCH codes. It appears likely that the extended Dorsch decoder described in Chap. 15 may be adapted for analogue BCH codes.

There are many ad hoc noise suppression algorithms used on analogue video and audio waveforms which cause artefacts in the signal processed outputs. There appears to be an opportunity to improve on these by using analogue BCH coding since the output of the decoder is always a codeword. For high quality systems this will predominantly be the transmitted codeword and therefore the decoder output will be free of artefacts.

Whilst most communication these days is digitally based, analogue communication is usually far more bandwidth efficient, particularly in wireless applications. By using analogue BCH codes, analogue communications may be attractive once more, particularly for niche applications.

Steganography is another area in which analogue BCH codes may be utilised. Errors in parity check equations may be used to communicate data in a side channel. By virtue of the parity check equations these errors may be distributed over multiple PAM samples or pixels. Secrecy may be assured by using a combination of secret permutations of the parity check matrix columns and a secret linear matrix transformation so that the parity check equations are unknown by anyone other than the originator.

#### **11.7 Summary**

Many information sources are naturally analogue and must be digitised if they are to be transmitted digitally. The process of digitisation introduces quantisation errors and increases the bandwidth required. The use of analogue error-correcting codes eliminates the need for digitisation. It has been shown that analogue BCH codes may be constructed in the same way as finite field BCH codes, including Reed–Solomon codes. The difference is that the field of complex numbers is used instead of a prime field or prime power field. It has been shown how the Mattson–Solomon polynomial, or equivalently the Discrete Fourier Transform, may be used as the basis for the construction of analogue codes. It has also been shown that a permuted parity check matrix produces an equivalent code, using a primitive root of unity to construct the code as in discrete BCH codes.

A new algorithm was presented which uses symbolwise multiplication of rows of a parity check matrix to produce directly the parity check matrix in reduced echelon form. The algorithm may be used for constructing reduced echelon parity check matrices for standard BCH and RS codes as well as analogue BCH codes. Gaussian elimination or other means of solving parallel, simultaneous equations are completely avoided by the method. It was also proven that analogue BCH codes are Maximum Distance Separable (MDS) codes. Examples have been presented of using the analogue BCH codes in providing error-correction for analogue, band-limited data including the correction of impulse noise errors in BCH encoded, analogue stereo music waveforms. It is shown that since the data is bandlimited it is already redundant and the parity check symbols replace existing values so that there is no need for bandwidth expansion as in traditional error-correcting codes. Future research areas have been outlined including an analogue, maximum likelihood, error-correcting decoder based on the extended Dorsch decoder of Chap. 15. Steganography is another future application area for analogue BCH codes.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 12 LDPC Codes**

## **12.1 Background and Notation**

LDPC codes are linear block codes whose parity-check matrix—as the name implies—is sparse. These codes can be iteratively decoded using the sum product [9] or equivalently the belief propagation [24] soft decision decoder. It has been shown, for example by Chung et al. [3], that for long block lengths, the performance of LDPC codes is close to the channel capacity. The theory of LDPC codes is related to a branch of mathematics called graph theory. Some basic definitions used in graph theory are briefly introduced as follows.

**Definition 12.1** (*Vertex, Edge, Adjacent and Incident*) A graph, denoted by *G*(*V*, *E*), consists of a set of vertices *V*(*G*) and a set of edges *E*(*G*), where each edge joins a pair of vertices. Two vertices joined by an edge are said to be *adjacent*, and an edge is said to be *incident* with each of the two vertices it joins.


**Definition 12.2** (*Degree*) The degree of a vertex *v* ∈ *V*(*G*) is the number of edges that are incident with vertex *v*, i.e. the number of edges that are connected to vertex *v*.

**Definition 12.3** (*Bipartite or Tanner graph*) Bipartite or Tanner graph *G*(*V*, *E*) consists of two disjoint sets of vertices, say *Vv*(*G*) and *Vp*(*G*), such that *V*(*G*) = *Vv*(*G*) ∪ *Vp*(*G*), and every edge (*vi*, *pj*) ∈ *E*(*G*), such that *vi* ∈ *Vv*(*G*) and *pj* ∈ *Vp*(*G*) for some integers *i* and *j*.


An [*n*, *k*, *d*] LDPC code may be represented by a Tanner graph *G*(*V*, *E*). The parity-check matrix *H* of the LDPC code consists of |*Vp*(*G*)| = *n* − *k* rows and |*Vv*(*G*)| = *n* columns. The sets of vertices *Vv*(*G*) and *Vp*(*G*) are called *variable* and *parity-check* vertices, respectively. Figure 12.1 shows the parity check matrix and the corresponding Tanner graph of a [16, 4, 4] LDPC code. Let *Vv*(*G*) = (*v*<sub>0</sub>, *v*<sub>1</sub>,..., *v*<sub>*n*−1</sub>) and *Vp*(*G*) = (*p*<sub>0</sub>, *p*<sub>1</sub>,..., *p*<sub>*n*−*k*−1</sub>); we can see that for each (*v<sub>i</sub>*, *p<sub>j</sub>*) ∈ *E*(*G*), the entry in the *j*th row and *i*th column of *H* is non-zero, i.e. *H*<sub>*j*,*i*</sub> ≠ 0, for 0 ≤ *i* ≤ *n* − 1 and 0 ≤ *j* ≤ *n* − *k* − 1.

**Fig. 12.1** Representations of a [16, 4, 4] LDPC code
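The correspondence between *H* and the Tanner graph, together with a simple algebraic test for length-4 cycles (two rows sharing more than one column), can be illustrated as follows. The small matrix is an assumption for the sketch, not the code of Fig. 12.1.

```python
import numpy as np

# a small regular parity-check matrix (lambda = 2 non-zeros per column,
# rho = 4 per row), chosen purely for illustration
H = np.array([
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 0, 1, 1],
])

# Tanner graph edges: (v_i, p_j) is an edge exactly when H[j, i] != 0
edges = [(i, j) for j in range(H.shape[0]) for i in range(H.shape[1]) if H[j, i]]
var_degree = np.count_nonzero(H, axis=0)   # degrees of the variable vertices
chk_degree = np.count_nonzero(H, axis=1)   # degrees of the check vertices
assert (var_degree == 2).all() and (chk_degree == 4).all()
assert len(edges) == var_degree.sum()

# length-4 cycle test: an off-diagonal entry of H @ H.T larger than 1 means
# two checks share two variable vertices, so the girth of this graph is 4
overlap = H @ H.T
np.fill_diagonal(overlap, 0)
assert overlap.max() > 1    # e.g. rows 0 and 1 both check columns 2 and 3
```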

**Definition 12.4** (*Cycle*) A cycle in a graph *G*(*V*, *E*) is a sequence of edges that starts and ends at the same vertex, with no other vertex repeated. For a bipartite graph *G*(*V*, *E*), exactly half of the distinct vertices in a cycle belong to *Vv*(*G*) and the remaining half belong to *Vp*(*G*).

**Definition 12.5** (*Girth and Local Girth*) The girth of graph *G*(*V*, *E*) is the length of the shortest cycle in the graph *G*(*V*, *E*). The local girth of a vertex *v* ∈ *V*(*G*) is the length of shortest cycle that passes through vertex *v*.

The performance of a typical iteratively decodable code (e.g. an LDPC or turbo code) may be partitioned into three regions, namely the erroneous, waterfall and error floor regions, see Fig. 12.2. The erroneous region occurs at low *E*<sub>*b*</sub>/*N*<sub>0</sub> values and is indicated by the inability of the iterative decoder to correctly decode almost all of the transmitted messages. As we increase the signal power, the error rate of the iterative decoder decreases rapidly, resembling a waterfall. The *E*<sub>*b*</sub>/*N*<sub>0</sub> value at which the waterfall region starts is commonly known as the *convergence threshold* in the literature. At higher *E*<sub>*b*</sub>/*N*<sub>0</sub> values, the error rate starts to flatten, introducing an error floor in the frame error rate (FER) curve.

In addition to this FER curve, the offset sphere packing lower bound and the probability of error based on the union bound argument as described in Chap. 1 are also plotted in Fig. 12.2. The sphere packing lower bound represents the region of attainable performance of a coding system. The performance to the left of this lower bound is not attainable, whereas that to the right may be achieved by some coding and decoding arrangements. The other curve is the union bound on the probability of error, which is dominated by the low Hamming weight codewords and the number of codewords of these Hamming weights. Typically, the larger the minimum Hamming distance of a code, the lower the union bound. For iteratively decodable codes which are not designed to maximise the minimum Hamming distance, the union bound intersects with the offset sphere packing lower bound at relatively low *E*<sub>*b*</sub>/*N*<sub>0</sub> values.

**Fig. 12.2** Waterfall and error regions of a typical LDPC code for the AWGN channel

It may be seen that, with an ideal soft decision decoder, the performance of a coding system would follow the sphere packing lower bound and, at higher *E*<sub>*b*</sub>/*N*<sub>0</sub> values, the performance floors due to the limitation of the minimum Hamming weight codewords. However, as depicted in Fig. 12.2, there is a relatively wide gap between the union bound and the error floor of a typical iteratively decodable code. This is an inherent behaviour of iteratively decodable codes and it is attributed to the weakness of the iterative decoder. There are other error events, which are not caused by the minimum Hamming weight codewords, that prevent the iterative decoder from reaching the union bound.

In terms of the construction technique, we may divide LDPC codes into two categories: random and algebraic LDPC codes. We may also classify LDPC codes into two categories depending on the structure of the parity-check matrix, namely regular and irregular codes—refer to Sect. 12.1.1 for the definition. Another attractive construction method that has been shown to offer capacity-achieving performance is non-binary construction.

#### *12.1.1 Random Constructions*

Gallager [8] introduced the (*n*, λ, ρ) LDPC codes where *n* represents the block length whilst the number of non-zeros per column and the number of non-zeros per row are represented by λ and ρ, respectively.

The short notation (λ, ρ) is also commonly used to represent these LDPC codes. The coderate of the Gallager (λ, ρ) codes is given by

$$\mathcal{R} = 1 - \frac{\lambda}{\rho}.$$

An example of the parity-check matrix of a Gallager (λ, ρ) LDPC code is shown in Fig. 12.1a. It is a [16, 4, 4] code with a λ of 3 and a ρ of 4. The parity-check matrix of the (λ, ρ) Gallager codes always has a fixed number of non-zeros per column and per row, and because of this property, this class of LDPC codes is termed regular LDPC codes. The performance of the Gallager LDPC codes in the waterfall region is not as satisfactory as that of turbo codes of the same block length and code rate. Many efforts have been devoted to improving the performance of LDPC codes, and one example that provides significant improvement is the introduction of the irregular LDPC codes by Luby et al. [18]. The irregular LDPC codes, as the name implies, do not have a fixed number of non-zeros per column or per row, and thus the level of error protection varies over a codeword. The columns of a parity-check matrix that have a higher number of non-zeros provide stronger error protection than those that have a lower number of non-zeros. Given an input block in iterative decoding, errors in those coordinates of the block whose columns of the parity-check matrix have a larger number of non-zeros will be corrected earlier, i.e. only a small number of iterations are required. In the subsequent iterations, the corrected values in these coordinates will then be utilised to correct errors in the remaining coordinates of the block.

**Definition 12.6** (*Degree Sequences*) The polynomial Λ<sub>λ</sub>(*x*) = Σ<sub>*i*≥1</sub> λ<sub>*i*</sub>*x*<sup>*i*</sup> is called the symbol or variable degree sequence, where λ<sub>*i*</sub> is the fraction of vertices of degree *i*. Similarly, Λ<sub>ρ</sub>(*x*) = Σ<sub>*i*≥1</sub> ρ<sub>*i*</sub>*x*<sup>*i*</sup> is the check degree sequence, where ρ<sub>*i*</sub> is the fraction of vertices of degree *i*.

The degree sequences given in the above definition are usually known as *vertex-oriented degree sequences*. Another representation is the *edge-oriented degree sequence*, which considers the fraction of edges that are connected to a vertex of a certain degree. Irregular LDPC codes are defined by these degree sequences, and in the following it is assumed that the degree sequences are vertex-oriented.

*Example 12.1* An irregular LDPC code with the following degree sequences

$$\begin{aligned} \Lambda\_{\lambda}(x) &= 0.5x^2 + 0.26x^3 + 0.17x^5 + 0.07x^{10} \\ \Lambda\_{\rho}(x) &= 0.80x^{14} + 0.20x^{15} \end{aligned}$$

has 50, 26, 17 and 7% of the columns with 2, 3, 5 and 10 ones per column, respectively, and 80 and 20% of the rows with 14 and 15 ones per row, respectively.
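Counting the edges of the Tanner graph from both sides gives *n* × (average variable degree) = (*n* − *k*) × (average check degree), so vertex-oriented degree sequences imply the code rate R = 1 − avg_var/avg_chk; for a regular (λ, ρ) code this reduces to the Gallager expression R = 1 − λ/ρ given earlier. A quick check for Example 12.1:

```python
# vertex-oriented degree sequences of Example 12.1: {degree: fraction}
var_seq = {2: 0.50, 3: 0.26, 5: 0.17, 10: 0.07}
chk_seq = {14: 0.80, 15: 0.20}

avg_var = sum(d * f for d, f in var_seq.items())   # average variable degree
avg_chk = sum(d * f for d, f in chk_seq.items())   # average check degree
rate = 1 - avg_var / avg_chk

print(round(avg_var, 2), round(avg_chk, 2), round(rate, 3))  # 3.33 14.2 0.765
```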

Various techniques have been proposed to design good degree distributions. Richardson et al. [27] used *density evolution* to determine the convergence threshold and to optimise the degree distributions. Chung et al. [4] simplified the density evolution approach using *Gaussian approximation*. With the optimised degree distributions, Chung et al. [3] showed that the bit error rate performance of a long block length (*n* = 10<sup>7</sup>) irregular LDPC code was within 0.04 dB of the capacity limit for binary transmission over the AWGN channel, discussed in Chap. 1. This is within 0.18 dB of Shannon's limit [30]. The density evolution and Gaussian approximation methods, which make use of the concentration theorem [28], can only be used to design the degree distributions for infinitely long LDPC codes. The concentration theorem states that the performance of cycle-free LDPC codes can be characterised by the average performance of the ensemble. The cycle-free assumption is only valid for infinitely long LDPC codes, and cycles are inevitable for finite block-length LDPC codes. As may be expected, the performance of finite block-length LDPC codes with degree distributions derived from the concentration theorem differs considerably from the ensemble performance. There are various techniques to design good finite block-length LDPC codes, for instance see [1, 2, 10, 33]. In particular, significant contributions to the construction of practical LDPC codes, and to lowering their inherent error floor, have been made by Hu et al. [10] with the introduction of the progressive edge-growth (PEG) algorithm for constructing both regular and irregular LDPC codes, by Tian et al. [33] with the introduction of the *extrinsic message degree*, and more recently by Richter et al. [29], who improved the original PEG algorithm by introducing construction constraints that avoid certain cycles involving variable vertices of degree 3.

## *12.1.2 Algebraic Constructions*

In general, LDPC codes constructed algebraically have a regular structure in their parity-check matrix. The algebraic LDPC codes offer many advantages over randomly generated codes, such as a guaranteed minimum Hamming distance and, for cyclic constructions, simple encoding using shift registers.


One of the earliest algebraic LDPC code constructions was introduced by Margulis [21] using Ramanujan graphs. Lucas et al. [19] showed that the well-known difference set cyclic (DSC) [36] and one-step majority-logic decodable (OSMLD) [17] codes have good performance under iterative decoding. The iterative soft decision decoder offers significant improvement over the conventional hard decision majority-logic decoder. Another class of algebraic codes is the class of Euclidean and projective geometry codes, which are discussed in detail by Kou et al. [16]. Other algebraic constructions include those that use combinatorial techniques [13–15, 35].

It has been observed that in general, there is an inverse performance relationship between the minimum Hamming distance of the code and the convergence of the iterative decoder. Irregular codes converge well with iterative decoding, but the minimum Hamming distance is relatively poor. In contrast, algebraically constructed LDPC codes, which have high minimum Hamming distance, tend not to converge well with iterative decoding. Consequently, compared to the performance of irregular codes, algebraic LDPC codes may perform worse in the low SNR region and perform better in the high SNR region. This is attributed to the early error floor of the irregular codes. As will be shown later, for short block lengths (*n* < 350), cyclic algebraic LDPC codes offer some of the best performance available.

#### *12.1.3 Non-binary Constructions*

LDPC codes may be easily extended so that the symbols take values from the finite field F<sub>2<sup>*m*</sup></sub>, and Davey et al. [6] were the pioneers in this area. Given an LDPC code over F<sub>2</sub> with parity-check matrix *H*, we may construct an LDPC code over F<sub>2<sup>*m*</sup></sub> for *m* ≥ 2 by simply replacing every non-zero element of *H* with any non-zero element of F<sub>2<sup>*m*</sup></sub>, in a random or structured manner. Davey et al. [6] and Hu et al. [11] have shown that the performance of LDPC codes can be improved by going beyond the binary field. The non-binary LDPC codes have better convergence behaviour under iterative decoding. Using some irregular non-binary LDPC codes, whose parity-check matrices are derived by randomly replacing the non-zeros of the PEG-constructed irregular binary LDPC codes, Hu et al. [11] demonstrated that an additional coding gain of 0.25 dB was achieved. The improved performance may be regarded as attributable to the improved graph structure in the non-binary arrangement. Consider a cycle of length 6 in the Tanner graph of a binary LDPC code, represented as the following sequence of pairs of edges {(*v*<sub>0</sub>, *p*<sub>0</sub>), (*v*<sub>3</sub>, *p*<sub>0</sub>), (*v*<sub>3</sub>, *p*<sub>2</sub>), (*v*<sub>4</sub>, *p*<sub>2</sub>), (*v*<sub>4</sub>, *p*<sub>1</sub>), (*v*<sub>0</sub>, *p*<sub>1</sub>)}. If we replace the corresponding entries in the parity-check matrix with some non-zeros over F<sub>2<sup>*m*</sup></sub> for *m* ≥ 2, provided that these six entries are not all the same, the cycle length becomes longer than 6. According to McEliece et al. [22] and Etzion et al. [7], the non-convergence of the iterative decoder is caused by the existence of cycles in the Tanner graph representation of the code. Cycles, especially those of short lengths, introduce correlations in the reliability information exchanged in iterative decoding. Since cycles are inevitable for finite block length codes, it is desirable to have LDPC codes with large girth.

The non-binary LDPC codes also offer an attractive match for higher-order modulation methods. The increased complexity of the symbol-based iterative decoder can be moderated, as the reliability information from the component codes may be efficiently evaluated using the frequency-domain dual-code decoder based on the fast Walsh–Hadamard transform [6].

## **12.2 Algebraic LDPC Codes**

Based on idempotents and cyclotomic cosets, see Chap. 4, a class of cyclic codes that is suitable for iterative decoding may be constructed. This class of cyclic codes falls into the class of one-step majority-logic decodable (OSMLD) codes, whose parity-check polynomial is orthogonal on each bit position, implying the absence of cycles of length 4 in the underlying Tanner graph. The corresponding parity-check matrix is sparse, and the codes can thus be used as LDPC codes.

**Definition 12.7** (*Binary Parity-Check Idempotent*) Let *M* ⊆ *N* and let the polynomial *u*(*x*) ∈ *T* (*x*) be defined by

$$u(x) = \sum_{s \in M} e_s(x) \tag{12.1}$$

where *es*(*x*) is an idempotent. The polynomial *u*(*x*) is called a binary parity-check idempotent.

The binary parity-check idempotent *u*(*x*) can be used to describe an [*n*, *k*] cyclic code as discussed in Chap. 4. Since GCD(*u*(*x*), *x^n* − 1) = *h*(*x*), the polynomial ū(*x*) = *x*^deg(*u*(*x*)) *u*(*x*^−1) and its *n* cyclic shifts (mod *x^n* − 1) can be used to define the parity-check matrix of a binary cyclic code. In general, wt_H(ū(*x*)) is much lower than wt_H(*h*(*x*)), and therefore a low-density parity-check matrix can be derived from ū(*x*).
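As a sketch of how ū(x) and the resulting parity-check matrix can be formed, the following Python fragment represents a binary polynomial by its support (the set of exponents with non-zero coefficients); this representation, and the n × n circulant matrix of cyclic shifts, are illustrative choices rather than the book's software.

```python
def reciprocal(exps):
    """Support of x^deg(u) * u(1/x) for a binary polynomial u(x) given
    by its support (set of exponents)."""
    d = max(exps)
    return sorted(d - e for e in exps)

def circulant_parity_check(exps, n):
    """n x n binary matrix whose rows are the n cyclic shifts
    (mod x^n - 1) of the reciprocal polynomial u_bar(x); its rank is
    n - k, so the rows are not all independent."""
    ubar = reciprocal(exps)
    H = []
    for shift in range(n):
        row = [0] * n
        for e in ubar:
            row[(e + shift) % n] = 1
        H.append(row)
    return H

# u(x) = 1 + x + x^3, n = 7: u_bar(x) = 1 + x^2 + x^3, and every row
# of H has weight wt_H(u(x)) = 3
H = circulant_parity_check({0, 1, 3}, 7)
assert reciprocal({0, 1, 3}) == [0, 2, 3]
assert all(sum(row) == 3 for row in H)
```

The low row weight, here 3 compared with that of h(x), is what makes the matrix suitable as an LDPC parity-check matrix.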

Let the parity-check polynomial ū(*x*) = *x*^ū_0 + *x*^ū_1 + ··· + *x*^ū_t have weight *t* + 1. Since the code defined by ū(*x*) is cyclic, for each non-zero coefficient ū_i of ū(*x*) there are another *t* parity-check polynomials of weight *t* + 1 which also have a non-zero coefficient at position ū_i. Furthermore, considering the set of these *t* + 1 polynomials that have a non-zero coefficient at position ū_i, no more than one polynomial in the set has a non-zero coefficient at any other position ū_j. In other words, if we count the number of times the positions 0, 1, ..., *n* − 1 appear among the exponents of this set of *t* + 1 polynomials, we shall find that all positions except ū_i appear at most once. This set of *t* + 1 polynomials is said to be *orthogonal* on position ū_i. The mathematical expression of this orthogonality is given in the following definition and lemma.

**Definition 12.8** (*Difference Enumerator Polynomial*) Let the polynomial *f* (*x*) ∈ *T* (*x*). The difference enumerator of *f* (*x*), denoted as *D*( *f* (*x*)), is defined as

$$D(f(x)) = f\left(x\right) f\left(x^{-1}\right) = d_0 + d_1 x + \dots + d_{n-1} x^{n-1}, \tag{12.2}$$

where it is assumed that *D*(*f*(*x*)) is reduced modulo *x^n* − 1, with coefficients taking values from R (real coefficients).

**Lemma 12.1** *Let di for* 0 ≤ *i* ≤ *n* − 1 *denote the coefficients of D*(*u*¯(*x*))*. If di* ∈ {0, 1}*, for all i* ∈ {1, 2,..., *n* − 1}*, the parity-check polynomial derived from u*¯(*x*) *is orthogonal on each position in the n-tuple. Consequently,*

*(i) the minimum distance of the resulting LDPC code is* 1 + wt*<sup>H</sup>* (*u*¯(*x*))*, and*

*(ii) the underlying Tanner Graph has girth of at least* 6*.*

*Proof* (i) [25, Theorem 10.1] Let a codeword *c*(*x*) = *c*_0 + *c*_1 *x* + ··· + *c*_{n−1} *x*^{n−1}, with *c*(*x*) ∈ *T*(*x*). For each non-zero bit position *c*_j of *c*(*x*), where *j* ∈ {0, 1, ..., *n* − 1}, there are wt_H(*u*(*x*)) parity-check equations orthogonal on position *c*_j. Each of these parity-check equations must check another non-zero bit *c*_l, where *l* ≠ *j*, so that the equation is satisfied. Clearly, wt_H(*c*(*x*)) must be at least 1 + wt_H(*u*(*x*)), and this is the minimum weight over all codewords.

(ii) The direct consequence of having orthogonal parity-check equations is the absence of cycles of length 4 in the Tanner graph. Let *a*, *b* and *c*, where *a* < *b* < *c*, be three distinct coordinates in an *n*-tuple; since *d*_i ∈ {0, 1} for 1 ≤ *i* ≤ *n* − 1, it follows that *b* − *a* ≠ *c* − *b*. We know that *q*(*b* − *a*) (mod *n*) ∈ {1, 2, ..., *n* − 1}, and thus *q*(*b* − *a*) ≡ (*c* − *b*) (mod *n*) for some integer *q* ∈ {1, 2, ..., *n* − 1}. If we associate the integers *a*, *b* and *c* with variable vertices in the Tanner graph, a cycle of length 6 is produced.

It can be deduced that the cyclic LDPC code with parity-check polynomial *u*¯(*x*) is an OSMLD code if *di* ∈ {0, 1}, for all *i* ∈ {1, 2,..., *n* − 1} or a difference set cyclic (DSC) code if *di* = 1, for all *i* ∈ {1, 2,..., *n* − 1}, where *di* is the coefficient of *D*(*u*¯(*x*)).
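The difference enumerator and the OSMLD/DSC classification above can be computed directly from the support of ū(x). The Python sketch below (an assumed set-of-exponents representation) illustrates this on the classical perfect difference set {1, 2, 4} modulo 7.

```python
def difference_enumerator(exps, n):
    """Coefficients d_0..d_{n-1} of D(f(x)) = f(x) f(x^{-1}) mod x^n - 1,
    taken over the reals: d_i counts the cyclic differences equal to i."""
    d = [0] * n
    for a in exps:
        for b in exps:
            d[(a - b) % n] += 1
    return d

def classify(exps, n):
    """OSMLD if every non-zero difference occurs at most once,
    DSC if every non-zero difference occurs exactly once."""
    d = difference_enumerator(exps, n)[1:]
    if all(c == 1 for c in d):
        return "DSC"
    if all(c <= 1 for c in d):
        return "OSMLD"
    return "neither"

# {1, 2, 4} is a perfect difference set mod 7: every non-zero difference
# occurs exactly once, giving a difference-set cyclic code
assert classify({1, 2, 4}, 7) == "DSC"
```

Note that d_0 always equals the weight of the polynomial, which is why the classification inspects only the non-zero positions.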

In order to arrive at either OSMLD or DSC codes, the following design conditions are imposed on *u*¯(*x*) and therefore, *u*(*x*):

**Condition 12.1** The idempotent *u*(*x*) must be chosen such that

$$\text{wt}_H(u(x)) \left(\text{wt}_H(u(x)) - 1\right) \le n - 1.$$

*Proof* There are wt*<sup>H</sup>* (*u*(*x*)) polynomials of weight wt*<sup>H</sup>* (*u*(*x*)) that are orthogonal on position *j* for some integer *j*. The number of distinct positions in this set of polynomials is wt*<sup>H</sup>* (*u*(*x*))(wt*<sup>H</sup>* (*u*(*x*)) − 1), and this number must be less than or equal to the total number of distinct integers between 1 and *n* − 1.

**Condition 12.2** Following Definition 12.8, let *W* = {*i* | *d*_i = 1, 1 ≤ *i* ≤ *n* − 1}; the cardinality of *W* must be equal to wt_H(*u*(*x*))(wt_H(*u*(*x*)) − 1).

*Proof* The cyclic differences between the exponents of the polynomial *u*(*x*) are given by *D*(*u*(*x*)) = Σ_{i=0}^{n−1} *d*_i *x*^i, where the coefficient *d*_i is the number of occurrences of a difference and the exponent *i* is the difference itself. The polynomial *u*(*x*) and some of its cyclic shifts are orthogonal on position 0, and this means that all of the cyclic differences between the exponents of *u*(*x*) (excluding zero) must be distinct, i.e. *d*_i ∈ {0, 1} for 1 ≤ *i* ≤ *n* − 1. Since the weight of *u*(*x*) excluding *x*^0 is wt_H(*u*(*x*)) − 1, and there are wt_H(*u*(*x*)) cyclic shifts of *u*(*x*) that are orthogonal on *x*^0, the number of distinct exponents among the cyclic differences is wt_H(*u*(*x*))(wt_H(*u*(*x*)) − 1) = |*W*|.

**Condition 12.3** The exponents of *u*(*x*) must not contain a common factor of *n*, otherwise a degenerate code, a repetition of a shorter cyclic code, is the result.

*Proof* If the exponents of *u*(*x*) contain a common factor of *n*, say *p* with *n* = *pr*, then *u*(*x*) is a polynomial in *x^p* and defines a cyclic code of length *r*. Every codeword of the longer code is a repetition of a codeword of this shorter cyclic code.

**Condition 12.4** Following (12.1), unless wt_H(*e*_s(*x*)) = 2, the binary parity-check idempotent *e*_s(*x*) must not be self-reciprocal, i.e. *e*_s(*x*) ≠ *e*_s(*x*^−1), for all *s* ∈ *M*.

*Proof* The number of non-zero coefficients of *D*(*e*_s(*x*)) is equal to

wt_H(*e*_s(*x*))(wt_H(*e*_s(*x*)) − 1).

For the self-reciprocal case, *e*_s(*x*) *e*_s(*x*^−1) = *e*_s^2(*x*) = *e*_s(*x*), which has wt_H(*e*_s(*x*)) non-zero coefficients. Following Condition 12.1, the inequality

$$\text{wt}_H(e_s(x)) \left(\text{wt}_H(e_s(x)) - 1\right) \le \text{wt}_H(e_s(x))$$

becomes an equality if and only if wt_H(*e*_s(*x*)) = 2.

**Condition 12.5** Following (12.1), *u*(*x*) must not contain the reciprocal idempotent *e*_s(*x*^−1*) of any of its constituent idempotents *e*_s(*x*), *s* ∈ *M*, unless *e*_s(*x*) is self-reciprocal.

*Proof* If *u*(*x*) contains *e*_s(*x*^−1) for some *s* ∈ *M*, then *D*(*u*(*x*)) will contain both *e*_s(*x*)*e*_s(*x*^−1) and *e*_s(*x*^−1)*e*_s(*x*); hence, some of the coefficients *d*_i of *D*(*u*(*x*)) satisfy *d*_i ∉ {0, 1}.

Although the above conditions seem overly restrictive, they turn out to be helpful in code construction. Codes may be designed stage by stage, adding candidate idempotents to *u*(*x*) and checking the above conditions at each stage.

In order to encode the cyclic LDPC codes constructed, there is no need to determine *g*(*x*). With α defined as a primitive *n*th root of unity, it follows from Lemma 4.4 that *u*(α^i) ∈ {0, 1} for 0 ≤ *i* ≤ *n* − 1. Let *J* = {*j*_0, *j*_1, ..., *j*_{n−k−1}} be a set of integers between 0 and *n* − 1 such that *g*(α^j) = 0, for all *j* ∈ *J*. Because *u*(*x*) does not contain any α^j as a root, it follows that *u*(α^j) = 1, for all *j* ∈ *J*. In F_2, 1 + *u*(α^j) = 0, and the polynomial 1 + *u*(*x*) = *e*_g(*x*), the generating idempotent of the code, may be used to generate the codewords as an alternative to *g*(*x*).

The number of information symbols of the cyclic LDPC codes can be determined either from the number of roots of *u*(*x*) which are also *n*th roots of unity, i.e. *k* = *n* − wt_H(*U*(*z*)), or from the degree of GCD(*u*(*x*), *x^n* − 1) = *h*(*x*).
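The GCD route can be illustrated with binary polynomials packed into integer bitmasks, an assumed representation chosen for brevity. For the length-7 parity-check idempotent u(x) = x + x^2 + x^4, GCD(u(x), x^7 − 1) = 1 + x + x^3 = h(x), so k = 3.

```python
def poly_mod(a, b):
    """Remainder of binary polynomial a modulo b; polynomials are packed
    into integer bitmasks, bit i being the coefficient of x^i."""
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)
    return a

def poly_gcd(a, b):
    while b:
        a, b = b, poly_mod(a, b)
    return a

def dimension(exps, n):
    """k = deg GCD(u(x), x^n - 1), the dimension of the cyclic code whose
    parity-check matrix is defined by the reciprocal of u(x)."""
    u = 0
    for e in exps:
        u |= 1 << e
    h = poly_gcd(u, (1 << n) | 1)  # x^n - 1 = x^n + 1 over F_2
    return h.bit_length() - 1

# u(x) = x + x^2 + x^4, n = 7: GCD(u(x), x^7 + 1) = 1 + x + x^3 = h(x),
# so the code has dimension k = 3 (the [7, 3, 4] simplex code)
assert poly_gcd(0b10110, 0b10000001) == 0b1011
assert dimension({1, 2, 4}, 7) == 3
```

Lemma 12.1 then gives the minimum distance of this small example as 1 + wt_H(u(x)) = 4, consistent with the simplex code.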

*Example 12.2* Consider the design of a cyclic LDPC code of length 63. The cyclotomic cosets modulo 63 are given in Example 4.2. Let *u*(*x*) be defined by *C*_9, i.e. *u*(*x*) = *e*_9(*x*) = *x*^9(1 + *x*^9 + *x*^27). *D*(ū(*x*)) indicates that the parity-check matrix defined by ū(*x*) has no cycles of length 4; however, following Condition 12.3, the result is a degenerate code consisting of repetitions of codewords of length 7.

With *u*(*x*) = *e*_23(*x*) = *x*^23(1 + *x*^6 + *x*^20 + *x*^23 + *x*^30 + *x*^35), the resulting cyclic code is a [63, 31, 6] LDPC code which is non-degenerate, and its underlying Tanner graph has girth 6. This code can be further improved by adding *e*_21(*x*) to *u*(*x*). Despite *e*_21(*x*) being self-reciprocal, its weight is 2, satisfying Condition 12.4. Now *u*(*x*) = *x*^21(1 + *x*^2 + *x*^8 + *x*^21 + *x*^22 + *x*^25 + *x*^32 + *x*^37), and the result is a [63, 37, 9] cyclic LDPC code.
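The checks used in this example (idempotency, orthogonality per Lemma 12.1 and non-degeneracy per Condition 12.3) can be verified numerically. The sketch below, which represents each idempotent by its support, reproduces the three cases of Example 12.2.

```python
from math import gcd
from functools import reduce

def cyclotomic_coset(s, n):
    """Cyclotomic coset of s modulo n (closed under doubling)."""
    c, e = [], s % n
    while e not in c:
        c.append(e)
        e = (2 * e) % n
    return set(c)

def is_idempotent(exps, n):
    # over F_2, u(x)^2 = u(x^2) (mod x^n - 1), so u is an idempotent
    # iff its support is closed under doubling mod n
    return {(2 * e) % n for e in exps} == set(exps)

def orthogonal(exps, n):
    # Lemma 12.1: all non-zero cyclic differences distinct => girth >= 6
    diffs = [(a - b) % n for a in exps for b in exps if a != b]
    return len(diffs) == len(set(diffs))

def non_degenerate(exps, n):
    # Condition 12.3: the exponents share no common factor with n
    return gcd(reduce(gcd, exps), n) == 1

n = 63
u9 = cyclotomic_coset(9, n)        # e_9(x): degenerate (common factor 9)
u23 = cyclotomic_coset(23, n)      # e_23(x): the [63, 31, 6] code
u = u23 | cyclotomic_coset(21, n)  # e_23 + e_21, weight 8: the [63, 37, 9] code

assert not non_degenerate(u9, n)
assert is_idempotent(u23, n) and orthogonal(u23, n) and non_degenerate(u23, n)
assert orthogonal(u, n) and len(u) == 8
```

For the weight-8 idempotent, orthogonality on every position gives a minimum Hamming distance of 1 + 8 = 9, matching the [63, 37, 9] parameters.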

Based on the theory described above, an algorithm has been developed which exhaustively searches for all non-degenerate cyclic LDPC codes of length *n* that have orthogonal parity-check polynomials; it is given in Algorithm 12.1.

#### **Algorithm 12.1** CodeSearch(**V**, *index*)

#### **Input:**

*index* ⇐ an integer, initialised to −1

**V** ⇐ a vector, initialised to ∅

*S* ⇐ *N* excluding 0

#### **Output:**

**CodesList** contains the set of cyclic codes which have orthogonal parity-check polynomials

1: **T** ⇐ **V**
2: **for** *i* = *index* + 1; *i* ≤ |*S*|; *i*++ **do**
3: **T**prev ⇐ **T**
4: **if** Σ_{∀t∈**T**} |*C*_{S_t}| ≤ √*n*, where *S*_t is the *t*th element of *S* **then**
5: Append *i* to **T**
6: *u*(*x*) = Σ_{∀t∈**T**} *e*_{S_t}(*x*)
7: **if** *u*(*x*) is non-degenerate **and** *u*(*x*) is orthogonal on each position (Lemma 12.1) **then**
8: *U*(*z*) = MS(*u*(*x*))
9: *k* = *n* − wt_H(*U*(*z*))
10: *C* ⇐ an [*n*, *k*, 1 + wt_H(*u*(*x*))] cyclic code defined by *u*(*x*)
11: **if** *k*/*n* ≥ 1/4 **and** *C* ∉ **CodesList** **then**
12: Add *C* to **CodesList**
13: **end if**
14: **end if**
15: CodeSearch(**T**, *i*)
16: **end if**
17: **T** ⇐ **T**prev
18: **end for**

Table 12.1 lists some examples of codes obtained from Algorithm 12.1. Note that all codes with code rate less than 0.25 are excluded from the table, and codes of longer lengths may also be constructed. We can also see that some of the codes in Table 12.1 have the same parameters as the Euclidean and projective geometry codes, which have been shown by Jin et al. [16] to perform well under iterative decoding.


**Table 12.1** Examples of 2-cyclotomic coset-based LDPC codes

A key feature of the cyclotomic coset-based construction is the ability to increase the minimum Hamming distance of a code by adding further weight from other idempotents, at the cost of steadily decreasing the sparseness of the resulting parity-check matrix. Although the construction method produces LDPC codes with no cycles of length 4, it is important to remark that codes that have cycles of length 4 in their parity-check matrices do not necessarily have bad performance under iterative decoding; a similar finding has been demonstrated by Tang et al. [31]. It has been observed that there are many cyclotomic coset-based LDPC codes with this property, and the constraints in Algorithm 12.1 can easily be relaxed to allow the construction of cyclic LDPC codes with girth 4.

# *12.2.1 Mattson–Solomon Domain Construction of Binary Cyclic LDPC Codes*

The [*n*, *k*, *d*] cyclic LDPC codes presented in Sect. 4.4 are constructed using the sum of idempotents, which are derived from the cyclotomic cosets modulo *n*, to define the parity-check matrix. A different insight into this construction technique may be obtained by working in the Mattson–Solomon domain.

Let *n* be a positive odd integer, F_{2^m} be a splitting field for *x^n* − 1 over F_2, α be a generator for F_{2^m}, and *T*_m(*x*) be the set of polynomials of degree at most *n* − 1 with coefficients in F_{2^m}. Similar to Sect. 4.4, the notation *T*(*x*) is used as an alternative to *T*_1(*x*), and the variables *x* and *z* are used to distinguish the polynomials in the domain and codomain. Let the decomposition of *z^n* − 1 into irreducible polynomials over F_2 be contained in the set *F* = {*f*_1(*z*), *f*_2(*z*), ..., *f*_t(*z*)}, i.e. Π_{1≤i≤t} *f*_i(*z*) = *z^n* − 1. For each *f*_i(*z*), there is a corresponding primitive idempotent, denoted θ_i(*z*), which can be obtained by [20]

$$\theta\_i(z) = \frac{z(z^n - 1)f\_i'(z)}{f\_i(z)} + \delta(z^n - 1) \tag{12.3}$$

where *f*_i′(*z*) = (d/d*z*) *f*_i(*z*), *f*_i′(*z*) ∈ *T*(*z*), and the integer δ is defined by

$$
\delta = \begin{cases} 1 & \text{if } \text{deg}(f\_i(z)) \text{ is odd,} \\ 0 & \text{otherwise.} \end{cases}
$$
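Equation (12.3) can be evaluated over F_2 with elementary polynomial arithmetic. The sketch below packs binary polynomials into integer bitmasks (an assumed representation) and recovers, for n = 7 and f(z) = 1 + z + z^3, the primitive idempotent θ(z) = 1 + z + z^2 + z^4.

```python
def pmul(a, b):
    """Carry-less product of binary polynomials packed into bitmasks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def pdivmod(a, b):
    q, db = 0, b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        s = a.bit_length() - 1 - db
        q |= 1 << s
        a ^= b << s
    return q, a

def pderiv(a):
    """Formal derivative over F_2: only odd-degree terms survive."""
    r = 0
    for i in range(1, a.bit_length()):
        if (a >> i) & 1 and i % 2:
            r |= 1 << (i - 1)
    return r

def primitive_idempotent(f, n):
    """theta_i(z) of (12.3) for an irreducible factor f of z^n - 1."""
    xn1 = (1 << n) | 1                # z^n - 1 = z^n + 1 over F_2
    q, rem = pdivmod(xn1, f)
    assert rem == 0, "f must divide z^n - 1"
    delta = (f.bit_length() - 1) % 2  # 1 iff deg(f) is odd
    th = pmul(2, pmul(q, pderiv(f)))  # z * (z^n - 1)/f * f'(z); 2 encodes z
    if delta:
        th ^= xn1
    out = 0                           # reduce mod z^n - 1: fold exponents mod n
    for i in range(th.bit_length()):
        if (th >> i) & 1:
            out ^= 1 << (i % n)
    return out

# f(z) = 1 + z + z^3 divides z^7 - 1; its primitive idempotent is
# theta(z) = 1 + z + z^2 + z^4, whose support {0, 1, 2, 4} is fixed by
# doubling mod 7, confirming idempotency
assert primitive_idempotent(0b1011, 7) == 0b10111
```

The reduction modulo z^n − 1 simply folds exponents modulo n, which is all that is needed since the coefficients are binary.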

Let the decomposition of *z<sup>n</sup>* −1 and its corresponding primitive idempotent be listed as follows:

$$
\begin{array}{ccc}
u_1(x) & \theta_1(z) & f_1(z) \\
u_2(x) & \theta_2(z) & f_2(z) \\
\vdots & \vdots & \vdots \\
u_t(x) & \theta_t(z) & f_t(z).
\end{array}
$$

Here *u*_1(*x*), *u*_2(*x*), ..., *u*_t(*x*) are the binary idempotents whose Mattson–Solomon polynomials are θ_1(*z*), θ_2(*z*), ..., θ_t(*z*), respectively. Assume that *I* ⊆ {1, 2, ..., *t*}, and let the binary polynomials *u*(*x*) = Σ_{∀i∈I} *u*_i(*x*), *f*(*z*) = Π_{∀i∈I} *f*_i(*z*), and θ(*z*) = Σ_{∀i∈I} θ_i(*z*). It is apparent that, since *u*_i(*x*) = MS^−1(θ_i(*z*)), *u*(*x*) = MS^−1(θ(*z*)) and *u*(*x*) is an idempotent.<sup>1</sup>

Recall that *u*(*x*) is a low-weight binary idempotent whose reciprocal polynomial can be used to define the parity-check matrix of a cyclic LDPC code. The number of distinct *n*th roots of unity which are also roots of the idempotent *u*(*x*) determines the dimension of the resulting LDPC code. In this section, the design of cyclic LDPC codes is based on several important features of a code. These features, which are listed as follows, may be easily gleaned from the Mattson–Solomon polynomial of *u*(*x*) and the binary irreducible factors of *z<sup>n</sup>* − 1.

#### 1. **Weight of the idempotent** *u*(*x*)

The weight of *u*(*x*) is the number of *n*th roots of unity which are zeros of *f*(*z*). Note that *f*(α^i) = 0 if and only if θ(α^i) = 1, since an idempotent takes only the values 0 and 1 over F_{2^m}. If *u*(*x*) is written as *u*_0 + *u*_1 *x* + ··· + *u*_{n−1} *x*^{n−1}, then following (11.2) we have

$$
u_i = \theta(\alpha^i) \pmod{2} \qquad \text{for } i = 0, 1, \dots, n - 1.
$$

Therefore, *ui* = 1 precisely when *f* (α*<sup>i</sup>* ) = 0, giving wt*<sup>H</sup>* (*u*(*x*)) as the degree of the polynomial *f* (*z*).

#### 2. **Number of zeros of** *u*(*x*)

Following (11.1), it is apparent that the number of zeros of *u*(*x*) which are roots of unity, which is also the dimension of the code *k*, is

$$\text{Number of zeros of } u(x) = k = n - \text{wt}_H\left(\theta(z)\right). \tag{12.4}$$

#### 3. **Minimum Hamming distance bound**

The lower bound of the minimum Hamming distance of a cyclic code, defined by idempotent *u*(*x*), is given by its BCH bound, which is determined by the number of consecutive powers of α, taken cyclically (mod *n*), which are roots of the generating idempotent *eg*(*x*) = 1 + *u*(*x*). In the context of *u*(*x*), it is the same as the number of consecutive powers of α, taken cyclically (mod *n*), which are not roots of *u*(*x*). Therefore, it is the largest number of consecutive non-zero coefficients in θ (*z*), taken cyclically (mod *n*).
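The quantity r_θ, the largest cyclic run of consecutive non-zero coefficients of θ(z), is straightforward to compute from the support of θ(z). The sketch below walks twice around the cyclic coefficient sequence to catch wrap-around runs; as an illustration, for the length-7 primitive idempotent θ(z) = 1 + z + z^2 + z^4 the run is 3, so the BCH bound gives d ≥ 4.

```python
def r_theta(support, n):
    """Largest number of consecutive non-zero coefficients of theta(z),
    taken cyclically mod n (support is the set of non-zero positions)."""
    nz = [i % n in support for i in range(2 * n)]  # two turns for wrap-around
    best = run = 0
    for v in nz:
        run = run + 1 if v else 0
        best = max(best, run)
    return min(best, n)

# theta(z) = 1 + z + z^2 + z^4 (mod z^7 - 1): r_theta = 3, so the BCH
# bound gives minimum Hamming distance d >= r_theta + 1 = 4
assert r_theta({0, 1, 2, 4}, 7) == 3
```

For this small example the bound is tight: the corresponding binary code is the [7, 3, 4] code.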

The method of finding *f*_i(*z*) is well established and, using the above information, a systematic search for idempotents of suitable weight may be developed. To be efficient, the search procedure has to start with increasing order of wt_H(*u*(*x*)), and this requires rearrangement of the set *F* such that deg(*f*_i(*z*)) ≤ deg(*f*_{i+1}(*z*)) for all *i*. It is worth mentioning that it is not necessary to evaluate *u*(*x*), by taking the

<sup>1</sup>Since the Mattson–Solomon polynomial of a binary polynomial is an idempotent and vice-versa [20], the Mattson–Solomon polynomial of a binary idempotent is also a binary idempotent.

Mattson–Solomon polynomial of θ(*z*), for each *f*(*z*) obtained. It is more efficient to obtain *u*(*x*) only once the desired code criteria, listed above, are met.

For an exhaustive search, the complexity is of order O(2^|F|). A search algorithm, Algorithm 12.2, has been developed which reduces the complexity considerably by targeting the search on the following key parameters. Note that this search algorithm, which operates in the Mattson–Solomon domain, is not constrained to find cyclic codes that have girth of at least 6.

#### 1. **Sparseness of the parity-check matrix**

A necessary condition for the absence of cycles of length 4 is given by the inequality wt*<sup>H</sup>* (*u*(*x*))(wt*<sup>H</sup>* (*u*(*x*)) − 1) ≤ *n* − 1. Since wt*<sup>H</sup>* (*u*(*x*)) = deg( *f* (*z*)), a reasonable bound is

$$\sum_{\forall i \in I} \deg(f_i(z)) \le \sqrt{n}.$$

In practice, this limit is extended to enable the finding of good cyclic LDPC codes which have girth of 4 in their underlying Tanner graphs.

#### 2. **Code rate**

The code rate is directly proportional to the number of roots of *u*(*x*). If *Rmin* represents the minimum desired code rate, then it follows from (12.4) that we can refine the search to consider the cases where

$$\text{wt}\_H(\theta(\mathcal{z})) \le (1 - R\_{\text{min}})n\text{ .}$$

#### 3. **Minimum Hamming distance**

If the idempotent *u*(*x*) is orthogonal on each position, then the minimum Hamming distance of the resulting code defined by *u*(*x*) is equal to 1 + wt_H(*u*(*x*)). However, for cyclic codes with cycles of length 4, there is no direct method to determine the minimum Hamming distance, and the BCH bound provides a lower bound to it. Let *d* be the lowest desired minimum Hamming distance and *r*_θ be the largest number of consecutive non-zero coefficients, taken cyclically, of θ(*z*). If a cyclic code has *r*_θ equal to *d*, then its minimum Hamming distance is at least 1 + *d*. It follows that we can further refine the search with the constraint

$$r\_{\theta} \ge d - 1.$$

In comparison to the construction method described in Sect. 4.4, we can see that the construction given in Sect. 4.4 starts from the idempotent *u*(*x*), whereas this section starts from the idempotent θ (*z*), which is the Mattson–Solomon polynomial of *u*(*x*). Both construction methods are equivalent and the same cyclic LDPC codes are produced.

#### **Algorithm 12.2** MSCodeSearch(**V**, *index*)

#### **Input:**

**V** ⇐ a vector, initialised to ∅

*index* ⇐ an integer, initialised to −1

*R_min* ⇐ minimum code rate of interest

*d* ⇐ lowest expected minimum distance

δ ⇐ small positive integer


**Output:**

**CodesList** contains set of codes

1: **T** ⇐ **V**
2: **for** *i* = *index* + 1; *i* ≤ |*F*|; *i*++ **do**
3: **T**prev ⇐ **T**
4: **if** Σ_{∀j∈**T**} deg(*f*_j(*z*)) + deg(*f*_i(*z*)) ≤ √*n* + δ **then**
5: Append *i* to **T**
6: θ(*z*) ⇐ Σ_{∀j∈**T**} θ_j(*z*)
7: **if** wt_H(θ(*z*)) ≤ (1 − *R_min*)*n* **and** *r*_θ > *d* **then**
8: *u*(*x*) ⇐ MS^−1(θ(*z*))
9: **if** *u*(*x*) is non-degenerate **then**
10: *C* ⇐ a cyclic code defined by *u*(*x*)
11: **if** *C* ∉ **CodesList** **then**
12: Add *C* to **CodesList**
13: **end if**
14: **end if**
15: **end if**
16: MSCodeSearch(**T**, *i*)
17: **end if**
18: **T** ⇐ **T**prev
19: **end for**

Some good cyclic LDPC codes with cycles of length 4, found using Algorithm 12.2 (they may also be found using Algorithm 12.1), are tabulated in Table 12.2. A check based on Lemma 12.1 may easily be incorporated at Step 12 of Algorithm 12.2 to filter out cyclic codes whose Tanner graph has girth of 4.

Figure 12.3 demonstrates the FER performance of several cyclic LDPC codes found by Algorithm 12.2. It is assumed that binary antipodal signalling is employed, and the iterative decoder uses the RVCM algorithm described by Papagiannis et al. [23]. The FER performance is compared against the sphere packing lower bound offset for binary transmission. We can see that the [127, 84, 10] and [127, 99, 7] codes, despite having cycles of length 4, are around 0.3 dB from the offset sphere packing lower bound at 10^−4 FER. Figure 12.3c compares two LDPC codes of block size 255 and dimension 175: an algebraic code obtained by Algorithm 12.2 and an irregular code constructed using the PEG algorithm [10]. It can be seen that, in addition to having improved minimum Hamming distance, the cyclic LDPC code is 0.4 dB superior to the irregular code and, compared to the offset sphere packing lower bound, is within 0.25 dB at 10^−4 FER. The effect of the error floor is apparent in the FER performance of the [341, 205, 6] irregular LDPC code, as


**Table 12.2** Several good cyclic LDPC codes with girth of 4

**Fig. 12.3** FER performance of some binary cyclic LDPC codes

shown in Fig. 12.3d. The floor of this irregular code is largely attributed to minimum Hamming distance error events. Whilst this irregular code has better convergence in the low SNR region than the algebraic LDPC code of the same block length and dimension, the benefit of having higher minimum Hamming distance is obvious as the SNR increases. The [341, 205, 16] cyclic LDPC code is approximately 0.8 dB away from the offset sphere packing lower bound at 10^−4 FER.

It is clear that short block length (*n* ≤ 350) cyclic LDPC codes have outstanding performance, and the gap to the offset sphere packing lower bound is relatively small. However, as the block length increases, the algebraic LDPC codes, although they have large minimum Hamming distance, have a convergence issue, and the threshold to the waterfall region occurs at larger *E_b*/*N*_0. The convergence problem arises because, as the minimum Hamming distance increases, the weight of the idempotent *u*(*x*), which defines the parity-check matrix, also increases. In fact, if *u*(*x*) satisfies Lemma 12.1, we know that wt_H(*u*(*x*)) = *d* − 1, where *d* is the minimum Hamming distance of the code. Large values of wt_H(*u*(*x*)) result in a parity-check matrix that is not as sparse as that of a good irregular LDPC code of the same block length and dimension.

# *12.2.2 Non-Binary Extension of the Cyclotomic Coset-Based LDPC Codes*

The code construction technique for the cyclotomic coset-based binary cyclic LDPC codes, which is discussed in Sect. 4.4, may be extended to non-binary fields. Similar to the binary case, the non-binary construction produces the dual-code idempotent which is used to define the parity-check matrix of the associated LDPC code.

Let *m* and *m*′ be positive integers with *m* | *m*′, so that F_{2^m} is a subfield of F_{2^m′}. Let *n* be a positive odd integer and F_{2^m′} be the splitting field of *x^n* − 1 over F_{2^m}, so that *n* | 2^m′ − 1. Let *r* = (2^m′ − 1)/*n*, *l* = (2^m′ − 1)/(2^m − 1), α be a generator for F_{2^m′} and β be a generator for F_{2^m}, where β = α^l. Let *T*_a(*x*) be the set of polynomials of degree at most *n* − 1 with coefficients in F_{2^a}. For the case of *a* = 1, we denote *T*_1(*x*) by *T*(*x*) for convenience.

The Mattson–Solomon polynomial and its corresponding inverse, (11.1) and (11.2), respectively, may be redefined as

$$A(z) = \text{MS}\,(a(\mathbf{x})) = \sum\_{j=0}^{n-1} a(\alpha^{-rj}) z^j \tag{12.5}$$

$$a(x) = \text{MS}^{-1}(A(z)) = \frac{1}{n} \sum_{i=0}^{n-1} A(\alpha^{ri}) x^i \tag{12.6}$$

where *a*(*x*) ∈ *T*_m(*x*) and *A*(*z*) ∈ *T*_m′(*z*).

Recall that a polynomial *e*(*x*) ∈ *T*_m(*x*) is termed an idempotent if the property *e*(*x*) = *e*(*x*)^2 (mod *x^n* − 1) is satisfied. Note that *e*(*x*) = *e*(*x*^2) (mod *x^n* − 1) does not generally hold unless *m* = 1. The following definition shows how to construct an idempotent for binary and non-binary polynomials.

**Definition 12.9** (*Cyclotomic Idempotent*) Let *N* be the set defined in Sect. 4.4, let *s* ∈ *N* and let *C*_{s,i} represent the (*i* + 1)th element of *C*_s, the cyclotomic coset of *s* (mod *n*). Assume that the polynomial *e*_s(*x*) ∈ *T*_m(*x*) is given by

$$e_s(x) = \sum_{0 \le i \le |C_s| - 1} e_{C_{s,i}} x^{C_{s,i}}, \tag{12.7}$$

where |*Cs*| is the number of elements in *Cs*. In order for *es*(*x*) to be an idempotent, its coefficients may be chosen in the following manner:

$$\text{(i)}\quad \text{if } m = 1,\ e_{C_{s,i}} = 1,$$

(ii) otherwise, *eCs*,*<sup>i</sup>* is defined recursively as follows:

$$\begin{array}{l}\text{for } i = 0,\ e_{C_{s,0}} \in \{1, \beta, \beta^2, \dots, \beta^{2^m - 2}\},\\ \text{for } i > 0,\ e_{C_{s,i}} = e_{C_{s,i-1}}^2.\end{array}$$

We refer to the idempotent *es*(*x*) as a cyclotomic idempotent.
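Definition 12.9 can be exercised numerically. The sketch below implements F_{2^m} multiplication as carry-less multiplication with reduction by a bitmask representation of the primitive polynomial (an assumed encoding), builds e_5(x) in the setting of Example 12.3 (F_{2^6} with p(x) = 1 + x + x^6 and e_{C_5,0} = β^23), and checks that the result is an idempotent.

```python
def gf_mul(a, b, prim, m):
    """Multiply in F_{2^m}: carry-less product with reduction by the
    primitive polynomial `prim` (bitmask of degree m)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> m:
            a ^= prim
    return r

def gf_pow(a, e, prim, m):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a, prim, m)
        a = gf_mul(a, a, prim, m)
        e >>= 1
    return r

def cyclotomic_idempotent(s, n, c0, prim, m):
    """Coefficients {exponent: field element} of e_s(x) per Definition
    12.9: walk the coset of s, doubling the exponent and squaring the
    coefficient at each step."""
    coeffs, e, c = {}, s % n, c0
    while e not in coeffs:
        coeffs[e] = c
        e = (2 * e) % n
        c = gf_mul(c, c, prim, m)
    return coeffs

# Example 12.3 setting: F_{2^6} built from p(x) = 1 + x + x^6, n = 21,
# beta a root of p(x), and e_{C_5,0} = beta^23
m, prim, n = 6, 0b1000011, 21
beta = 0b10  # the class of x
e5 = cyclotomic_idempotent(5, n, gf_pow(beta, 23, prim, m), prim, m)

# the support is the cyclotomic coset C_5 mod 21 ...
assert set(e5) == {5, 10, 13, 17, 19, 20}
# ... and e_5(x)^2 = e_5(x) (mod x^21 - 1): squaring maps each term
# c x^e to c^2 x^(2e mod 21), a permutation of the same terms
sq = {(2 * e) % n: gf_mul(c, c, prim, m) for e, c in e5.items()}
assert sq == e5
```

Closure of the recursion relies on e_{C_s,0} lying in F_{2^6} here, so that squaring |C_5| = 6 times returns the starting coefficient.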

**Definition 12.10** (*Parity-Check Idempotent*) Let *M* ⊆ *N* and let *u*(*x*) ∈ *Tm*(*x*) be

$$u(x) = \sum_{s \in M} e_s(x). \tag{12.8}$$

The polynomial *u*(*x*) is an idempotent and it is called a parity-check idempotent.

As in Sect. 4.4, the parity-check idempotent *u*(*x*) is used to define a cyclic LDPC code over F_{2^m}, which may be denoted by [*n*, *k*, *d*]_{2^m}. The parity-check matrix consists of the *n* cyclic shifts of *x*^deg(*u*(*x*)) *u*(*x*^−1). For the non-binary case, the minimum Hamming distance *d* of the cyclic code is bounded by

$$d_0 + 1 \le d \le \min\left(\text{wt}_H(g(x)),\ \text{wt}_H(1 + u(x))\right),$$

where *d*<sup>0</sup> is the maximum run of consecutive ones in *U*(*z*) = MS(*u*(*x*)), taken cyclically mod *n*.

Based on the description given above, a procedure to construct a cyclic LDPC code over F2*<sup>m</sup>* is as follows.


$$p(x) = \prod_{0 \le i \le |C'_1| - 1} \left(x + \alpha^{C'_{1,i}}\right). \tag{12.9}$$

Construct all elements of F_{2^m′} using *p*(*x*) as the primitive polynomial.


*Example 12.3* Consider the construction of an *n* = 21 cyclic LDPC code over F_{2^6}. The splitting field of *x*^21 − 1 over F_{2^6} is F_{2^6} itself, and this implies that *m* = *m*′ = 6, *r* = 3 and *l* = 1. Let *C* and *C*′ denote the cyclotomic cosets modulo *n* and 2^m′ − 1, respectively. We know that |*C*′_1| = 6, and therefore the primitive polynomial *p*(*x*) has roots α^j, for all *j* ∈ *C*′_1, i.e. *p*(*x*) = 1 + *x* + *x*^6. By letting 1 + β + β^6 = 0, all of the elements of F_{2^6} can be defined. Let *u*(*x*) be the parity-check idempotent generated by the sum of the cyclotomic idempotents defined by *C*_s, where *s* ∈ *M* = {5, 7, 9}, with *e*_{C_s,0} for *s* ∈ *M* equal to β^23, 1 and 1, respectively; then

$$\begin{split} u(x) &= \beta^{23} x^{5} + x^{7} + x^{9} + \beta^{46} x^{10} + \beta^{43} x^{13} + x^{14} + x^{15} + \beta^{53} x^{17} + x^{18} \\ &\quad + \beta^{58} x^{19} + \beta^{29} x^{20} \end{split}$$

and its Mattson–Solomon polynomial *U*(*z*) indicates that it is a [21, 15, ≥ 5]_{2^6} cyclic code, whose binary image is a [126, 90, 8] linear code.

The following systematic search algorithm is based on summing each possible combination of the cyclotomic idempotents to search for all possible F2*<sup>m</sup>* cyclic codes of a given length. As in Algorithm 12.2, the search algorithm targets the following key parameters:

#### 1. **Sparseness of the resulting parity-check matrix**

Since the parity-check matrix is directly derived from *u*(*x*) which consists of the sum of the cyclotomic idempotents, only low-weight cyclotomic idempotents are of interest. Let *Wmax* be the maximum wt*<sup>H</sup>* (*u*(*x*)); then the search algorithm will only choose the cyclotomic idempotents whose sum has total weight less than or equal to *Wmax* .

#### 2. **High code rate**

The number of roots of *u*(*x*) which are also roots of unity defines the dimension of the resulting LDPC code. Let the integer *k_min* be the minimum code dimension; the cyclotomic idempotents of interest are then those whose Mattson–Solomon polynomial has at least *k_min* zeros.

#### 3. **High minimum Hamming distance**

Let the integer *d* be the smallest acceptable value of the minimum Hamming distance of the code. The sum of the cyclotomic idempotents should have at least *d* − 1 consecutive powers of β which are roots of unity but not roots of *u*(*x*).
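The cyclotomic coset enumeration underlying this search is straightforward to sketch. The helper below is our illustration (not the authors' software): it partitions {0, …, *n* − 1} into cyclotomic cosets of 2 modulo *n*, from which the candidate idempotent supports and the splitting-field degree *m* = |*C*<sub>1</sub>| of Example 12.3 can be read off.

```python
def cyclotomic_cosets(n, q=2):
    """Partition {0, ..., n-1} into cyclotomic cosets C_s = {s * q^i mod n}."""
    cosets, seen = [], set()
    for s in range(n):
        if s in seen:
            continue
        coset, j = [], s
        while j not in seen:       # multiply by q until the cycle closes
            coset.append(j)
            seen.add(j)
            j = (j * q) % n
        cosets.append(coset)
    return cosets

# For n = 21: C_1 has 6 elements, so x^21 - 1 splits over F_{2^6} (m = 6),
# and C_5 = {5, 10, 13, 17, 19, 20} matches the support of u(x) in Example 12.3.
```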


**Table 12.3** Examples of [*n*, *k*, *d*]2*<sup>m</sup>* cyclic LDPC codes

†The minimum Hamming distance of the binary image which has been determined using the improved Zimmermann algorithm, Algorithm 5.1

Following Definition 12.10 and the Mattson–Solomon polynomial

$$U(z) = \text{MS}\left(\sum_{s\in\mathcal{M}} e_s(x)\right) = \sum_{s\in\mathcal{M}} E_s(z),$$

it is possible to maximise the run of consecutive ones in *U*(*z*) by varying the coefficients of *e<sub>s</sub>*(*x*). It is therefore important that all possible non-zero values of *e*<sub>*C<sub>s</sub>*,0</sub> for all *s* ∈ *M* are included, to guarantee that codes with the highest possible minimum Hamming distance are found.

Table 12.3 outlines some examples of [*n*, *k*, *d*]<sub>2<sup>m</sup></sub> cyclic LDPC codes. The non-binary algebraic LDPC codes in this table perform well under iterative decoding, as shown in Fig. 12.4, assuming binary antipodal signalling and the AWGN channel. The RVCM algorithm is employed in the iterative decoder. The FER performance of these non-binary codes is compared to the offset sphere packing lower bound in Fig. 12.4.

As mentioned in Sect. 12.1.2, there is an inverse relationship between the convergence of the iterative decoder and the minimum Hamming distance of a code. The algebraic LDPC codes, which have higher minimum Hamming distances compared to irregular LDPC codes, do not converge well at long block lengths. It appears that

**Fig. 12.4** FER performance of some non-binary cyclic LDPC codes

**Fig. 12.5** FER performance of algebraic and irregular LDPC codes of rate 0.6924 and code length 5461 bits

the best convergence at long code lengths can only be realised by irregular LDPC codes with good degree distributions. Figure 12.5 shows the performance of two LDPC codes of block length 5461 bits and code rate 0.6924; one is an irregular code constructed using the PEG algorithm and the other is an algebraic code of minimum Hamming distance 43 based on the cyclotomic coset and idempotent construction (see Table 12.1). These results are for the AWGN channel using binary antipodal signalling with a belief propagation iterative decoder running 100 iterations. We can see that at a FER of 10<sup>−5</sup>, the irregular PEG code is superior by approximately 1.6 dB to the algebraic cyclic LDPC code. However, for short code lengths, algebraic LDPC codes are superior: they have better performance and simpler encoders than ad hoc designed LDPC codes.

# **12.3 Irregular LDPC Codes from Progressive Edge-Growth Construction**

Hu et al. [11] showed that LDPC codes obtained using the PEG construction method can perform better than other types of randomly constructed LDPC codes. The PEG algorithm adds edges to each vertex such that the local girth is maximised. The PEG algorithm considers only the variable degree sequence, and the check degree

**Fig. 12.6** Effect of vertex degree ordering in PEG algorithm

sequence is maintained to be as uniform as possible. In this section, the results of experimental constructions of irregular LDPC codes using the PEG algorithm are presented. Analysis on the effects of the vertex degree ordering and degree sequences have been carried out by means of computer simulations. All simulation results in this section, unless otherwise stated, were obtained using binary antipodal signalling with the belief propagation decoder using 100 iterations. Each simulation run was terminated after the decoder had produced 100 erroneous frames.

Figure 12.6 shows the FER performance of various [2048, 1024] irregular LDPC codes constructed using the PEG algorithm with different vertex degree orderings. These LDPC codes have variable degree sequence Λ<sub>λ</sub>(*x*) = 0.475*x*<sup>2</sup> + 0.280*x*<sup>3</sup> + 0.035*x*<sup>4</sup> + 0.109*x*<sup>5</sup> + 0.101*x*<sup>15</sup>. Let (*v*<sub>0</sub>, *v*<sub>1</sub>, ..., *v*<sub>*i*</sub>, ..., *v*<sub>*n*−1</sub>) be the set of variable vertices of an LDPC code. Code 0 and Code 1 were constructed with an increasing vertex degree ordering, i.e. deg(*v*<sub>0</sub>) ≤ deg(*v*<sub>1</sub>) ≤ ··· ≤ deg(*v*<sub>*n*−1</sub>), whereas the remaining LDPC codes were constructed with random vertex degree orderings.

Figure 12.6 clearly shows that, unless the degree of the variable vertices is assigned in increasing order, poor LDPC codes are obtained. With random degree ordering of half-rate codes, it is very likely that, as the construction approaches the end, some low-degree variable vertices have no edge connected to them. Since almost all of the other variable vertices already have edges connected to them, these low-degree variable vertices have few choices of edges with which to maximise the local girth. It has been observed that, in many cases, these low-degree variable vertices are connected to each other, forming a cycle which involves all of them, and the resulting LDPC codes often have a low minimum Hamming distance. If *d* variable vertices are connected to each other and a cycle of length 2*d* is formed, then the minimum Hamming distance of the resulting code is *d*, because the sum of these *d* columns in the corresponding parity-check matrix *H* is **0**<sup>T</sup>.

In contrast, for the alternative construction which starts from an increasing degree of the variable vertices, edges are connected to the low-degree variable vertices earlier in the process. Short cycles, which involve the low-degree variable vertices and lead to low minimum Hamming distance, may be avoided by ensuring these lowdegree variable vertices have edges connected to the parity-check vertices which are connected to high-degree variable vertices.

It can be expected that the PEG algorithm will almost certainly produce poor LDPC codes if the degree of the variable vertices is assigned in descending order. It is concluded that all PEG-based LDPC codes should be constructed with increasing variable vertex degree ordering.
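The edge-placement rule described above can be sketched compactly. The following is our simplified illustration of a PEG-style construction (names and data layout are ours, not from [11]): variable vertices are processed in increasing degree order and each new edge goes to a check vertex that is either unreachable from the variable vertex (creating no new cycle) or at maximum distance in the current Tanner graph, breaking ties by lowest check degree to keep the check degree sequence uniform.

```python
from collections import deque

def peg(n_var, n_chk, degrees):
    """Sketch of PEG construction: add edges one at a time, maximising
    the local girth seen from each variable vertex."""
    var_adj = [set() for _ in range(n_var)]   # variable vertex -> check vertices
    chk_adj = [set() for _ in range(n_chk)]   # check vertex -> variable vertices
    for v in sorted(range(n_var), key=lambda v: degrees[v]):  # increasing order
        for _ in range(degrees[v]):
            # BFS over the bipartite graph from v to rank the check vertices.
            dist = {('v', v): 0}
            queue = deque([('v', v)])
            while queue:
                kind, x = queue.popleft()
                for y in (var_adj[x] if kind == 'v' else chk_adj[x]):
                    nxt = ('c' if kind == 'v' else 'v', y)
                    if nxt not in dist:
                        dist[nxt] = dist[(kind, x)] + 1
                        queue.append(nxt)
            unreached = [c for c in range(n_chk) if ('c', c) not in dist]
            if unreached:
                cands = unreached         # connecting here creates no new cycle
            else:
                far = max(d for (k, _), d in dist.items() if k == 'c')
                cands = [c for c in range(n_chk) if dist[('c', c)] == far]
            c = min(cands, key=lambda c: len(chk_adj[c]))  # uniform check degrees
            var_adj[v].add(c)
            chk_adj[c].add(v)
    return var_adj
```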

Figure 12.7 shows the effect of low-degree variable vertices, especially λ<sub>2</sub> and λ<sub>3</sub>, on the FER performance of various [512, 256] PEG-constructed irregular LDPC codes. Table 12.4 shows the variable degree sequences of the simulated irregular codes. Figure 12.7 indicates that, with the fraction of high-degree variable vertices kept constant, the low-degree variable vertices have influence over the convergence

**Fig. 12.7** Effect of low-degree variable vertices


**Table 12.4** Variable degree sequences for codes in Fig. 12.7

in the waterfall region. As the fraction of low-degree variable vertices is increased, the FER in the low signal-to-noise ratio (SNR) region improves. On the other hand, LDPC codes with a high fraction of low-degree variable vertices tend to have a low minimum Hamming distance and, as expected, these codes exhibit early error floors. This effect is clearly depicted by Code 7 and Code 8, which have the highest fraction of low-degree variable vertices among all the codes in Fig. 12.7. Of all of the codes, Code 6 and Code 24 appear to have the best performance.

Figure 12.8 demonstrates the effect of high-degree variable vertices on the FER performance. These codes are rate 3/4 irregular LDPC codes of length 1024 bits with the same degree sequences, apart from their maximum variable vertex degree. One group has maximum degree of 8 and the other group has maximum degree of 12. From Fig. 12.8, it is clear that the LDPC codes with maximum variable vertex degree of 12 converge better under iterative decoding than those codes with maximum variable vertex degree of 8.

In a similar manner to Fig. 12.7, the effect of having various low-degree variable vertices is also demonstrated in Fig. 12.9. In this case, the LDPC codes are constructed to have the advantageous linear-time encoding complexity, where the parity symbols are commonly described as having a zigzag pattern [26]. Here λ<sub>1</sub> and λ<sub>2</sub> of these LDPC codes are fixed and the effect of varying λ<sub>3</sub>, λ<sub>4</sub> and λ<sub>5</sub> is investigated.

**Fig. 12.8** Effect of high-degree variable vertices

**Fig. 12.9** Effect of varying low-degree variable vertices


**Table 12.5** Variable degree sequences of LDPC codes in Fig. 12.9

**Fig. 12.10** Effect of varying high-degree variable vertices




The variable degree sequences of the LDPC codes under investigation, which are rate 3/4 codes of length 1600 bits, are given in Table 12.5. The results show that, as in the previous cases, these low-degree variable vertices contribute to the waterfall region of the FER curve. The contribution of λ<sub>*i*</sub> is more significant than that of λ<sub>*i*+1</sub>, and this may be observed by comparing the FER curves of Code 1 with either Code 3 or Code 4, which have λ<sub>3</sub> = 0.0. We can also see that Code 0, which has the most variable vertices of low degree, exhibits a high error floor.

In contrast to Fig. 12.9, Fig. 12.10 shows the effect of varying high-degree variable vertices. The LDPC codes considered here all have the same code rate and code length as those in Fig. 12.9 and their variable degree sequences are shown in Table 12.6. The results show that


## **12.4 Quasi-cyclic LDPC Codes and Protographs**

Although irregular LDPC codes have lower error rates than their regular counterparts (Luby et al. [18]), the extra complexity of the encoder and decoder hardware structure has made this class of LDPC codes unattractive from an industry point of view. In order to encode an irregular code which has a parity-check matrix *H*, Gaussian elimination has to be carried out to transform this matrix into reduced echelon form. Irregular LDPC codes, as shown in Sect. 12.3, may also be constructed by constraining the *n* − *k* low-degree variable vertices of the Tanner graph to form a zigzag pattern, as pointed out by Ping et al. [26]. Translating these *n* − *k* variable vertices of the Tanner graph into matrix form, we have

$$\mathbf{H}_p = \begin{bmatrix} 1 & & & & \\ 1 & 1 & & & \\ & 1 & 1 & & \\ & & \ddots & \ddots & \\ & & & 1 & 1 \end{bmatrix}. \tag{12.10}$$

The matrix *H <sup>p</sup>* is non-singular and the columns of this matrix may be used as the coordinates of the parity-check bits of an LDPC code.

The use of zigzag parity checks does simplify the derivation of the encoder as the Gaussian elimination process is no longer necessary and encoding, assuming that

$$H = \left[\, H_u \,\middle|\, H_p \,\right] = \begin{array}{ccccc|ccccc}
\nu_0 & \nu_1 & \dots & \nu_{k-2} & \nu_{k-1} & \nu_k & \nu_{k+1} & \dots & \nu_{n-2} & \nu_{n-1} \\
\hline
\mu_{0,0} & \mu_{0,1} & \dots & \mu_{0,k-2} & \mu_{0,k-1} & 1 & & & & \\
\mu_{1,0} & \mu_{1,1} & \dots & \mu_{1,k-2} & \mu_{1,k-1} & 1 & 1 & & & \\
\vdots & \vdots & & \vdots & \vdots & & \ddots & \ddots & & \\
\mu_{n-k-2,0} & \mu_{n-k-2,1} & \dots & \mu_{n-k-2,k-2} & \mu_{n-k-2,k-1} & & & 1 & 1 & \\
\mu_{n-k-1,0} & \mu_{n-k-1,1} & \dots & \mu_{n-k-1,k-2} & \mu_{n-k-1,k-1} & & & & 1 & 1
\end{array},$$

can be performed by calculating parity-check bits as follows:

$$\begin{aligned} \nu_k &= \sum_{j=0}^{k-1} \nu_j \mu_{0,j} \pmod{2} \\ \nu_i &= \nu_{i-1} + \sum_{j=0}^{k-1} \nu_j \mu_{i-k,j} \pmod{2} \qquad \text{for } k+1 \le i \le n-1. \end{aligned}$$

Nevertheless, zigzag parity bit checks do not lead to a significant reduction in encoder storage space as the matrix *H<sup>u</sup>* still needs to be stored. It is necessary to introduce additional structure in *Hu*, such as using a quasi-cyclic property, to reduce significantly the storage requirements of the encoder.
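The parity recursion above is what makes zigzag encoding linear-time: each parity bit is the previous parity bit plus one row of *H<sub>u</sub>* applied to the message. A minimal sketch in pure Python (function and variable names are ours):

```python
def zigzag_encode(Hu, info):
    """Compute parities nu_k, ..., nu_{n-1} by the zigzag recursion:
    each parity bit is the previous one plus a row of H_u times the message."""
    parity, prev = [], 0
    for row in Hu:
        prev = (prev + sum(h * v for h, v in zip(row, info))) % 2
        parity.append(prev)
    return info + parity

# H_u is (n - k) x k; the returned codeword satisfies every zigzag check.
Hu = [[1, 0, 1],
      [1, 1, 0]]
codeword = zigzag_encode(Hu, [1, 0, 1])  # -> [1, 0, 1, 0, 1]
```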

#### *12.4.1 Quasi-cyclic LDPC Codes*

Quasi-cyclic codes have the property that each codeword is an *m*-position cyclic shift of another codeword, where *m* is an integer. With this property, simple feedback shift registers may be used for the encoder. Codes of this type are known as circulant codes, defined by circulant polynomials, and depending on the polynomials they can have significant mathematical structure, as described in Chap. 9. A circulant matrix is a square matrix in which each row is a cyclic shift of the previous row and the first row is the cyclic shift of the last row. In addition, each column is also a cyclic shift of the previous column, and the column weight is equal to the row weight.

A circulant matrix is defined by a polynomial *r*(*x*). If *r*(*x*) has degree less than *m*, the corresponding circulant matrix is an *m* × *m* square matrix. Let *R* be the circulant matrix defined by *r*(*x*); then *R* is of the form

$$\mathbf{R} = \begin{bmatrix} r(\mathbf{x}) \pmod{\boldsymbol{x}^m - 1} \\ \boldsymbol{x}r(\mathbf{x}) \pmod{\boldsymbol{x}^m - 1} \\ \vdots \\ \boldsymbol{x}^ir(\mathbf{x}) \pmod{\boldsymbol{x}^m - 1} \\ \vdots \\ \boldsymbol{x}^{m-1}r(\mathbf{x}) \pmod{\boldsymbol{x}^m - 1} \end{bmatrix} \tag{12.11}$$


where the polynomial in each row can be represented by an *m*-dimensional vector, which contains the coefficients of the corresponding polynomial. A quasi-cyclic code can be built from the concatenation of circulant matrices to define the generator or parity-check matrix.

*Example 12.4* A quasi-cyclic code with defining polynomials *r*<sub>1</sub>(*x*) = 1 + *x* + *x*<sup>3</sup> and *r*<sub>2</sub>(*x*) = 1 + *x*<sup>2</sup> + *x*<sup>5</sup>, where both polynomials have degree no greater than the maximum degree of 6, produces a parity-check matrix of the following form:

$$H = \left[\begin{array}{ccccccc|ccccccc}
1 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 1
\end{array}\right]$$
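A circulant block can be generated directly from its defining polynomial. The sketch below (our illustration) rebuilds the parity-check matrix of Example 12.4 from *r*<sub>1</sub>(*x*) and *r*<sub>2</sub>(*x*):

```python
def circulant(exponents, m):
    """m x m circulant: row i holds the coefficients of x^i * r(x) mod x^m - 1."""
    first = [0] * m
    for d in exponents:
        first[d % m] = 1
    return [[first[(j - i) % m] for j in range(m)] for i in range(m)]

# r1(x) = 1 + x + x^3 and r2(x) = 1 + x^2 + x^5 with m = 7, as in Example 12.4
H = [a + b for a, b in zip(circulant([0, 1, 3], 7), circulant([0, 2, 5], 7))]
```

Each row and column weight follows from the polynomial weights: every row has weight 3 + 3 = 6 and every column weight 3.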

**Definition 12.11** (*Permutation Matrix*) A permutation matrix is a type of circulant matrix in which each row or column has weight 1. A permutation matrix, denoted by *P*<sub>*m*,*j*</sub>, has *r*(*x*) = *x*<sup>*j*</sup> (mod *x*<sup>*m*</sup> − 1) as its defining polynomial and satisfies the property that *P*<sub>*m*,*j*</sub>*P*<sub>*m*,*j*</sub><sup>T</sup> = *I<sub>m</sub>*, where *I<sub>m</sub>* is the *m* × *m* identity matrix.

Due to the sparseness of the permutation matrix, these are usually used to construct quasi-cyclic LDPC codes. The resulting LDPC codes produce a parity-check matrix in the following form:

$$H = \begin{bmatrix} \mathbf{P}_{m,O_{0,0}} & \mathbf{P}_{m,O_{0,1}} & \dots & \mathbf{P}_{m,O_{0,t-1}} \\ \mathbf{P}_{m,O_{1,0}} & \mathbf{P}_{m,O_{1,1}} & \dots & \mathbf{P}_{m,O_{1,t-1}} \\ \vdots & \vdots & & \vdots \\ \mathbf{P}_{m,O_{s-1,0}} & \mathbf{P}_{m,O_{s-1,1}} & \dots & \mathbf{P}_{m,O_{s-1,t-1}} \end{bmatrix} \tag{12.12}$$

From (12.12), we can see that there exists an *s* × *t* matrix, denoted by *O*, underlying *H*. This matrix is called an *offset matrix* and it gives the exponent of *r*(*x*) in each permutation matrix, i.e.

$$\mathcal{O} = \begin{bmatrix} O\_{0,0} & O\_{0,1} & \dots & O\_{0,t-1} \\ O\_{1,0} & O\_{1,1} & \dots & O\_{1,t-1} \\ \vdots & \vdots & \vdots & \vdots \\ O\_{s-1,0} & O\_{s-1,1} & \dots & O\_{s-1,t-1} \end{bmatrix}$$

where 0 ≤ *O*<sub>*i*,*j*</sub> ≤ *m* − 1, for 0 ≤ *i* ≤ *s* − 1 and 0 ≤ *j* ≤ *t* − 1. Each permutation matrix *P*<sub>*m*,*j*</sub> has *m* rows and *m* columns, and since the matrix *H* is an *s* × *t* array of these matrices, the resulting code is a [*mt*, *m*(*t* − *s*), *d*] quasi-cyclic LDPC code over F<sub>2</sub>.

In general, some of the permutation matrices *P*<sub>*i*,*j*</sub> in (12.12) may be zero matrices. In this case, the resulting quasi-cyclic LDPC code is irregular, and the *O*<sub>*i*,*j*</sub> for which *P*<sub>*i*,*j*</sub> = **0** may be ignored. If none of the permutation matrices in (12.12) is a zero matrix, the quasi-cyclic LDPC code defined by (12.12) is an (*s*, *t*)-regular LDPC code.
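Assembling the parity-check matrix of (12.12) from an offset matrix can be sketched as follows (our notation; `None` marks a zero block for the irregular case):

```python
def perm(m, j):
    """Permutation matrix P_{m,j} with defining polynomial r(x) = x^j."""
    return [[1 if (c - r) % m == j else 0 for c in range(m)] for r in range(m)]

def qc_matrix(offsets, m):
    """Expand an s x t offset matrix O into the s*m x t*m matrix H of (12.12)."""
    zero = [[0] * m for _ in range(m)]
    H = []
    for row in offsets:
        blocks = [perm(m, o) if o is not None else zero for o in row]
        for r in range(m):
            H.append([bit for blk in blocks for bit in blk[r]])
    return H
```

With no `None` entries every row has weight *t* and every column weight *s*, i.e. the code is (*s*, *t*)-regular; a zero block makes it irregular.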

# *12.4.2 Construction of Quasi-cyclic Codes Using a Protograph*

A protograph is a miniature prototype Tanner graph of arbitrary size, which can be used to construct a larger Tanner graph by means of replicate and permute operations, as discussed by Thorpe [32]. A protograph may also be considered as an [*n*′, *k*′] linear code *P* of small block length and dimension. A longer code may be obtained by expanding code *P* by an integer factor *Q* so that the resulting code has parameters [*n* = *n*′*Q*, *k* = *k*′*Q*] over the same field. The simplest way to expand code *P*, and also to impose structure on the resulting code, is to replace each non-zero element of the parity-check matrix of code *P* with a *Q* × *Q* permutation matrix, and each zero element with a *Q* × *Q* zero matrix. As a consequence, the resulting code has a quasi-cyclic structure. The procedure is described in detail in the following example.

*Example 12.5* Consider a code *P* = [4, 2] over F<sub>2</sub> as a protograph. The parity-check matrix of code *P* is given by

$$H' = \begin{array}{c|cccc} & \nu_0 & \nu_1 & \nu_2 & \nu_3 \\ \hline c_0 & 1 & 1 & 0 & 1 \\ c_1 & 0 & 1 & 1 & 1 \end{array}. \tag{12.13}$$

Let the expansion factor be *Q* = 5. The expanded code, which is a [20, 10] code, has a parity-check matrix given by (12.14),

where the zero elements have been omitted. This protograph construction may also be described using the Tanner graph representation as shown in Fig. 12.11.

Initially, the Tanner graph of code *P* is replicated *Q* times. The edges of these replicated Tanner graphs are then permuted. The edges may be permuted in many ways and in this particular example, we want the permutation to produce a code which has quasi-cyclic structure. The edges shown in bold in Fig. 12.11 or equivalently the non-zeros shown in bold in (12.14) represent the code *P*.

The minimum Hamming distance of code *P* is 2 and this may be seen from its parity-check matrix, (12.13), where the summation of two column vectors, those of *v*<sup>1</sup> and *v*3, produces a zero vector. Since, in the expansion, only identity matrices are

**Fig. 12.11** Code construction using a protograph

employed, the expanded code will have the same minimum Hamming distance as the protograph code. This is obvious from (12.14) where the summation of two column vectors, those of *v*<sup>5</sup> and *v*15, produces a zero vector. In order to avoid the expanded code having low minimum Hamming distance, permutation matrices may be used instead and the parity-check matrix of the expanded code is given by (12.15).

The code defined by this parity-check matrix has a minimum Hamming distance of 3. In addition, the cycle structure of the protograph is also preserved in the expanded code if only identity matrices are used for expansion. Since the protograph is such a small code, the variable vertex degree distribution required to design a good target code, which is in general much larger than the protograph, causes many unavoidable short cycles in the protograph. By using appropriate permutation matrices in the expansion, these short cycles may be avoided in the expanded code.
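The effect described above is easy to check numerically. In this sketch (ours), the protograph matrix (12.13) is expanded by *Q* = 5 with identity blocks (all offsets zero); the columns corresponding to *v*<sub>5</sub> and *v*<sub>15</sub> then sum to zero, so the minimum distance of 2 is inherited, whereas non-trivial offsets can break this dependence.

```python
def perm(Q, j):
    """Circulant permutation matrix P_{Q,j}."""
    return [[1 if (c - r) % Q == j else 0 for c in range(Q)] for r in range(Q)]

def expand(Hproto, Q, offsets):
    """Replace each 1 at (i, j) by P_{Q, offsets[i][j]} and each 0 by a
    Q x Q zero block (sketch of the protograph expansion)."""
    zero = [[0] * Q for _ in range(Q)]
    H = []
    for i, row in enumerate(Hproto):
        blocks = [perm(Q, offsets[i][j]) if v else zero for j, v in enumerate(row)]
        for r in range(Q):
            H.append([bit for blk in blocks for bit in blk[r]])
    return H

Hproto = [[1, 1, 0, 1],
          [0, 1, 1, 1]]                      # parity-check matrix (12.13)
H = expand(Hproto, 5, [[0] * 4, [0] * 4])    # identity expansion
dependent = all((row[5] + row[15]) % 2 == 0 for row in H)  # columns v5 + v15 = 0
```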

In the following, we describe the construction of a long quasi-cyclic LDPC code for application in satellite communications. The standard for digital video broadcasting by satellite, commonly known as DVB-S2, makes use of a concatenation of LDPC and BCH codes to protect the video stream. The parity-check matrices of DVB-S2 LDPC codes contain a zigzag matrix for the *n* − *k* parity coordinates and quasi-cyclic matrices on the remaining *k* coordinates. In the literature, a code with this structure is commonly known as an irregular repeat accumulate (IRA) code [12].

The code construction described below, using a protograph and greedy PEG expansion, is aimed at improving on the performance of the rate 3/4 DVB-S2 LDPC code of block length 64800 bits. Let the [64800, 48600] LDPC code that we will construct be denoted by *C*<sub>1</sub>. A protograph code, which has parameters [540, 405], is constructed using the PEG algorithm with a good variable vertex degree distribution obtained from Urbanke [34],

$$\begin{split} \Lambda_{\lambda_1}(x) &= \underbrace{0.00185185x + 0.248148x^2}_{\text{for zigzag matrix}} + 0.55x^3 + 0.0592593x^5 \\ &\quad + 0.0925926x^8 + 0.00555556x^{12} + 0.00185185x^{15} + 0.0166667x^{19} \\ &\quad + 0.00185185x^{24} + 0.00185185x^{28} + 0.0203704x^{35}. \end{split}$$

The constructed [540, 405] protograph code has a parity-check matrix *H*′ = [*H*′<sub>*u*</sub> | *H*′<sub>*p*</sub>], where *H*′<sub>*p*</sub> is a 135 × 135 zigzag matrix, see (12.10), and *H*′<sub>*u*</sub> is an irregular matrix satisfying Λ<sub>λ<sub>1</sub></sub>(*x*) above. In order to construct the [64800, 48600] LDPC code *C*<sub>1</sub>, we need to expand the protograph code by a factor of *Q* = 120. In expanding the protograph code, we apply a greedy approach to construct the offset matrix *O* so as to obtain a Tanner graph for *C*<sub>1</sub> whose local girth is maximised. This greedy approach examines all offset values, from 0 to *Q* − 1, and picks an offset that results in the highest girth; if there is more than one choice, one of these is chosen at random. A 16200 × 48600 matrix *H*<sub>*u*</sub> can easily be constructed by replacing each non-zero element at coordinate (*i*, *j*) in *H*′<sub>*u*</sub> with the permutation matrix *P*<sub>*Q*,*O*<sub>*i*,*j*</sub></sub>. The resulting LDPC code *C*<sub>1</sub> has a parity-check matrix given by *H* = [*H*<sub>*u*</sub> | *H*<sub>*p*</sub>], where, as before, *H*<sub>*p*</sub> is given by (12.10).

In comparison, the rate 3/4 LDPC code of block length 64800 bits specified in the DVB-S2 standard uses a lower value of *Q* = 45. The protograph is a [1440, 1080] code which has the following variable vertex degree distribution

$$\Lambda_{\lambda_2}(x) = \underbrace{0.000694x + 0.249306x^2}_{\text{for zigzag matrix}} + 0.666667x^3 + 0.0833333x^{12}.$$

For convenience, we denote the DVB-S2 LDPC code by *C*2.

**Fig. 12.12** FER performance of the DVB-S2 and the designed [64800, 48600] LDPC codes

Figure 12.12 compares the FER performance of *C*<sub>1</sub> and *C*<sub>2</sub> using the belief propagation decoder with 100 iterations. Binary antipodal signalling and the AWGN channel are assumed. Note that, although the outer concatenated BCH code is not used, there is still no sign of an error floor at a FER as low as 10<sup>−6</sup>, which suggests that the BCH code is not required. It may be seen from Fig. 12.12 that the designed LDPC code, which at a FER of 10<sup>−5</sup> performs approximately 0.35 dB away from the sphere packing lower bound offset for binary transmission loss, is 0.1 dB better than the DVB-S2 code.

## **12.5 Summary**

The application of cyclotomic cosets, idempotents and Mattson–Solomon polynomials has been shown to produce many binary cyclic LDPC codes whose parity-check equations are orthogonal in each position. Whilst some of these excellent cyclic codes have the same parameters as the known class of finite geometry codes, other codes are new. A key feature of this construction technique is the incremental approach to the minimum Hamming distance and the sparseness of the resulting parity-check matrix of the code. Binary cyclic LDPC codes may also be constructed by considering idempotents in the Mattson–Solomon domain. This approach has provided a different insight into the cyclotomic coset-based construction. It has also been shown that, for short algebraic LDPC codes, the commonly held belief that codes with cycles of length 4 in their Tanner graph do not converge well under iterative decoding is not necessarily true. It has been demonstrated that the cyclotomic coset-based construction can easily be extended to produce good non-binary algebraic LDPC codes.

Good irregular LDPC codes may be constructed using the progressive edge-growth algorithm. This algorithm adds edges to the variable and check vertices in a way that maximises the local girth. Many code results have been presented showing the effects of choosing different degree distributions. Guidelines are given for designing the best codes.

Methods of producing structured LDPC codes, such as those which have a quasi-cyclic structure, have been described. These are of interest to industry due to the simplification of the encoder and decoder. An example of such a construction to produce a (64800, 48600) LDPC code, using a protograph, has been presented along with performance results using iterative decoding. Better results are obtained with this code than with the (64800, 48600) LDPC code used in the DVB-S2 standard.

## **References**




# **Part III Analysis and Decoders**

This part is about the analysis of codes in terms of their codeword and stopping set weight spectrum, and about various types of decoders. Decoders are described which include hard and soft decision decoders for the AWGN channel and decoders for the erasure channel. Universal decoders are discussed, which are decoders that can be used with any linear code for hard or soft decision decoding. One such decoder is based on the Dorsch decoder, and this is described in some detail together with its performance using several different code examples. Other decoders, such as the iterative decoder, require sparse parity-check matrices and codes specifically designed for this type of decoder. Also included in this part is a novel concatenated |*u*|*u* + *v*| code arrangement featuring multiple near maximum likelihood decoders for an optimised matching of codes and decoders. With some outstanding codes as constituent codes, the concatenated coding arrangement is able to outperform the best LDPC and turbo coding systems with the same code parameters.

# **Chapter 13 An Exhaustive Tree Search for Stopping Sets of LDPC Codes**

#### **13.1 Introduction and Preliminaries**

The performance of all error-correcting codes is determined by the minimum Hamming distance between codewords. For codes which are iteratively decoded, such as LDPC codes and turbo codes, performance on the erasure channel is determined by the stopping set spectrum: the weight (and number) of erasure patterns which cause the iterative decoder to fail to correct all of the erasures. Codes which perform poorly on the erasure channel do not perform well on the AWGN channel. To determine all of the stopping sets of a general (*n*, *k*) code is a prohibitive task; for example, a binary (1000, 700) code would require evaluation of 2<sup>1000</sup> possible stopping sets. It should be noted that all codewords are also stopping sets, but most stopping sets are not codewords. Fortunately, the properties of particular types of codes may be used to reduce considerably the scale of the task; in particular, codes with sparse parity-check matrices, such as LDPC codes and turbo codes, are amenable to analysis in practice. As the tree search is exhaustive, the emphasis is first on focusing the search so that only low-weight stopping sets, up to a specified weight, are found, and second on the efficiency of the algorithms involved.

In a landmark paper in 2007, Rosnes and Ytrehus [7] showed that exhaustive, low-weight stopping set analysis of codes whose parity-check matrix is sparse is feasible using a bounded tree search over the length of the code with no distinction between information and parity bits. An earlier paper on the same topic of an exhaustive search for the stopping sets of LDPC codes, by Wang et al. [2], used a different and much less efficient algorithm. In common with this earlier research, we use similar notation in the following preliminaries.

The code *C* is defined to be binary and linear of length *n* and dimension *k*; it is a *k*-dimensional subspace of {0, 1}<sup>*n*</sup> and may be specified as the null space of an *m* × *n* binary parity-check matrix **H** of rank *n* − *k*. The number of parity-check equations *m* of **H** satisfies *m* ≥ *n* − *k*, although there are, of course, only *n* − *k* independent parity-check equations. It should be noted, as illustrated in the results below, that the


number of parity-check equations *m* in excess of *n* −*k* can have a dramatic effect on the stopping set weight spectrum, excluding codewords of course, as these are not affected.

As in [7], *S* is used to denote a subset of {0, 1}<sup>*n*</sup>, the set of all binary vectors of length *n*. At any point in the tree search, a constraint set *F* is defined, consisting of bit positions *p<sub>i</sub>* and the states of these bit positions *s*<sub>*p<sub>i</sub>*</sub> ∈ {0, 1}. The support set χ(*F*) of the constraint set *F* is the set of positions where *s*<sub>*p<sub>i</sub>*</sub> = 1, and the Hamming weight of *F* is the number of such positions. The sub-matrix **H**<sub>χ(*F*)</sub> consists of all the columns of **H** where *s*<sub>*p<sub>i</sub>*</sub> = 1, and the row weight of a row of **H**<sub>χ(*F*)</sub> is the number of 1s in that row. An active row of **H**<sub>χ(*F*)</sub> is a row with unity row weight. It is obvious that if all rows of **H**<sub>χ(*F*)</sub> have even row weight then *F* is a codeword, noting that for an iterative decoder codewords are also stopping sets. If at least one row has odd weight of 3 or higher and there are no active rows, then *F* is a stopping set but not a codeword. If there are active rows, then *F* either has to be appended with additional bit positions, or one or more states *s*<sub>*p<sub>i</sub>*</sub> need to be changed, to form a stopping set. With this set of basic definitions, tree search algorithms may be described which carry out an exhaustive search of {0, 1}<sup>*n*</sup> using a sequence of constraints *F* to find all stopping sets whose Hamming weight is ≤ τ.
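These definitions translate directly into a test: a support set is a stopping set exactly when the corresponding sub-matrix has no active (weight-one) rows. A minimal sketch (our helper, not from [7]):

```python
def is_stopping_set(H, support):
    """True if no row of H restricted to the support columns has weight 1,
    i.e. the iterative erasure decoder cannot resolve any position in the set."""
    return all(sum(row[j] for j in support) != 1 for row in H)
```

For example, the support of any codeword passes this test (every restricted row has even weight), while a single erased position covered by an active row does not.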

#### **13.2 An Efficient Tree Search Algorithm**

At any given point in the search, the *constraint set F* is used to represent the set of searched known bits (up to this point) of a code *C*, which forms a *branch* of the tree in the tree search. The set of active rows in **H** is denoted by {h<sub>0</sub>, ..., h<sub>φ−1</sub>}, where φ is the total number of active rows. A constraint set *F* of size *n* is said to be *valid* if and only if there exist no active rows in **H**<sub>χ(*F*)</sub>, in which case the constraint set is equal to a stopping set. The pseudocode of one particularly efficient algorithm to find all the stopping sets, including codeword sets, of weight less than or equal to τ is given in Algorithm 13.1 below. Each time a stopping set is found, it is stored, and the algorithm progresses until the entire 2<sup>*n*</sup> space has been searched.

The modified iterative decoding is carried out on a binary input vector of length *n* containing erasures in some of the positions. Let *rj*(*F*) be the number of ones in row *j* of **H**, *j* ∈ {0, ..., *m* − 1}, over the constrained positions {*pi* : (*pi*, 1) ∈ *F*} intersected by row *j*, and let *r*′*j*(*F*) be the number of ones in row *j* over the unconstrained positions (those not in *F*) intersected by row *j*. The modified iterative decoding algorithm, based on the belief-propagation decoding algorithm for the binary erasure channel, is shown in Algorithm 13.2. As noted in the line marked (\*), the modified iterative decoder is not invoked if the condition *rj* ≤ 1 and *r*′*j* = 1 is not met, or if the branch with constraint set *F* satisfies *rj* = 1 and *r*′*j* = 0. This significantly speeds up the tree search.


#### **Algorithm 13.1** Tree search for all stopping sets of weight ≤ τ

**repeat**
  Pick one untouched branch as a constraint set *F*.
  **if** |*F*| = *n* and *w*(*F*) ≤ τ **then**
    Constraint set *F* is saved, if *F* is valid
  **else**
    1). Pass *F* to the modified iterative decoder (\*) with erasures in the unconstrained positions.
    2). Construct a new constraint set *F*′ with the newly decoded positions, which is the extended branch.
    **if** |*F*′| = *n* and *w*(*F*′) ≤ τ **then**
      Constraint set *F*′ is saved, if *F*′ is valid
    **else if** no contradiction is found in **H**(*F*′) and *w*(*F*′) ≤ τ **then**
      a). Pick an unconstrained position *p*.
      b). Extend branch *F*′ to position *p* to get the new branches *F*′ ∪ {(*p*, 1)} and *F*′ ∪ {(*p*, 0)}.
    **end if**
  **end if**
**until** Tree has been fully explored
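For very small codes, the result computed by the tree search can be cross-checked by brute force. The following sketch (an illustration of *what* Algorithm 13.1 computes, not the efficient tree search itself) enumerates every support of weight at most τ and keeps those with no active row:

```python
from itertools import combinations

def stopping_sets_upto(H, n, tau):
    """Brute-force enumeration of all stopping sets (including codeword
    supports) of weight <= tau; feasible only for very small n."""
    found = []
    for w in range(1, tau + 1):
        for chi in combinations(range(n), w):
            restricted = [sum(row[j] for j in chi) for row in H]
            if all(r != 1 for r in restricted):   # no active rows
                found.append(set(chi))
    return found

# toy example: (7, 4) Hamming parity-check matrix
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
```

For this matrix and τ = 3, every returned set has weight 3: the seven weight-3 codeword supports together with three non-codeword stopping sets such as {2, 4, 6}.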

#### **Algorithm 13.2** Modified Iterative Decoding

Get the counts **r**(*F*) and **r**′(*F*) for all the equation rows of **H**.
**repeat**
  **if** *rj* > 1 **then**
    Row *j* is flagged
  **else if** *rj* = 1 and *r*′*j* = 0 **then**
    Contradiction → Quit decoder
  **else if** *rj* ≤ 1 and *r*′*j* = 1 **then**
    1). Row *j* is flagged.
    2). The unconstrained variable bit *i* in row *j* is decoded as the **XOR** of the known bits of row *j*.
    3). The values of *rj* and *r*′*j* are updated for every row with *Hji* = 1.
  **end if**
**until** No new unconstrained bit is decoded
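The decoding rule at the heart of Algorithm 13.2 is the standard erasure peeling step: any check row with exactly one erased position determines that position as the XOR of its known bits. A minimal self-contained sketch (omitting the *rj*/*r*′*j* bookkeeping the algorithm uses for early termination), where `r` is a list of 0/1 values with `None` marking erasures:

```python
def peel(H, r):
    """Iterative (peeling) erasure decoding over the BEC.  Any positions
    still None on return form a stopping set for this parity-check matrix."""
    r = list(r)
    progress = True
    while progress:
        progress = False
        for row in H:
            unknown = [j for j, h in enumerate(row) if h and r[j] is None]
            if len(unknown) == 1:            # exactly one erased bit in the row
                j0 = unknown[0]
                r[j0] = sum(r[j] for j, h in enumerate(row)
                            if h and j != j0) % 2
                progress = True
    return r

# toy example: (7, 4) Hamming parity-check matrix
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
```

Erasing positions {0, 1} of the all-zero codeword is fully recovered, whereas erasing the stopping set {2, 4, 6} leaves all three positions unresolved, exactly as the definitions above predict.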

Thus, the computational complexity is significantly reduced compared with invoking the modified iterative decoder for every new branch and its corresponding constraint set *F*.

#### *13.2.1 An Efficient Lower Bound*

The tree search along the current branch may be terminated if the weight of the additional bits necessary to produce a stopping set, plus the weight of the current constraint set *F*, exceeds τ. Instead of actually evaluating these bits, it is more effective to calculate a lower bound on the weight of the additional bits. The bound uses the sets of active rows *I*(*F*) = {*I*<sup>i0</sup>(*F*), ..., *I*<sup>iq−1</sup>(*F*)}, where *I*<sup>i0</sup>(*F*) is the set of active rows, under constraint set *F*, intersected by the *i*0th column **h**<sup>i0</sup> of **H**, and *q* is the number






**Table 13.3** WiMax 2/3*A* LDPC Codes


**Table 13.4** WiMax 2/3*B* LDPC Codes

of intersected unknown bits. Let *w*(**h***j*) denote the number of active rows intersected by the *j*th column of **H**. Under a worst-case assumption, the unknown bit in the column intersecting the largest number of active rows is always assumed to take the value 1, so that these active rows are compensated and the total number of active rows φ is reduced by *w*(**h***j*), repeating until φ ≤ 0. Algorithm 13.3 gives the pseudocode for computing the smallest number *q* of intersected unknown bits required to leave no active rows. The resulting lower bound is *w*′(*F*) = *w*(*F*) + *q*.

**Algorithm 13.3** Simple method to find the smallest collection set of active rows

1). Arrange the set *I*(*F*) in descending order of column weight, so that **h**<sup>i0</sup> is the column intersecting the maximal number of active rows under constraint set *F*.
2). Initialise *q* to 0.
**while** φ > 0 **do**
  1). Subtract the next column weight in the ordered list from φ.
  2). Increment *q* by 1.
**end while**
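Algorithm 13.3 amounts to a greedy covering computation. A sketch, in which `phi` is the number of active rows and `active_counts` lists, for each unconstrained column, how many active rows it intersects (both assumed to be precomputed by the tree search):

```python
def smallest_cover(phi, active_counts):
    """Greedy lower bound q on the number of additional bits needed to
    leave no active rows: take columns in descending order of the number
    of active rows they intersect until phi is exhausted."""
    q = 0
    for w in sorted(active_counts, reverse=True):
        if phi <= 0:
            break
        phi -= w        # worst case: this bit compensates w active rows
        q += 1
    return q
```

For example, `smallest_cover(5, [3, 2, 2, 1])` returns 2, so the branch weight is bounded below by *w*(*F*) + 2 and the branch may be pruned whenever this exceeds τ.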


**Table 13.5** WiMax 3/4*A* LDPC Codes

**Table 13.6** WiMax 3/4*B* LDPC Codes


#### *13.2.2 Best Next Coordinate Position Selection*

In the evaluation of the lower bound above, the selected unconstrained positions are assumed to have the value 1. Accordingly, the first position in the ordered index list, having maximal column weight, is the best choice of coordinate to add to the constraint set *F*.


**Table 13.7** Weight Spectra and stopping set spectra for the WiMax LDPC Codes [1]

#### **13.3 Results**

The algorithms above have been used to evaluate all of the low-weight stopping sets for some well-known LDPC codes. The results are given in Table 13.1, together with the respective references where details of the codes may be found. The total number of stopping sets is shown for each weight, with the number of codewords given in parentheses. Interestingly, the Tanner code has 93 parity-check equations, 2 more than the 91 parity-check equations needed to encode the code. If only 91 parity-check equations are used by the iterative decoder, there is a stopping set of weight 12 instead of 18, which will degrade the performance of the decoder. The corollary of this is that the performance of some LDPC codes may be improved by introducing additional, dependent, parity-check equations obtained by selecting low-weight codewords of the dual code. A subsequent tree search will reveal whether the stopping sets have been improved as a result.

## *13.3.1 WiMax LDPC Codes*

WiMax LDPC codes [1], as specified in the IEEE 802.16e standard, have been fully analysed and the low-weight stopping sets for all combinations of code rates and lengths have been found. Detailed results for WiMax LDPC codes of code rates 1/2, 2/3*A*, 2/3*B*, 3/4*A*, 3/4*B* are given in Tables 13.2, 13.3, 13.4, 13.5 and 13.6. In these tables, the code index *i* is linked to the code length *N* by the formula *N* = 576 + 96*i*. The minimum weight of non-codeword stopping sets (*sm*) and codeword stopping sets (*dm*) for all WiMax LDPC codes is given in Table 13.7.
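The index-to-length mapping used in Tables 13.2–13.6 is a one-line computation:

```python
def wimax_length(i):
    """Block length N of the WiMax LDPC code with code index i,
    following the formula N = 576 + 96*i quoted above."""
    return 576 + 96 * i
```

So index 0 corresponds to the shortest length *N* = 576, and index 18 to *N* = 2304.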

#### **13.4 Conclusions**

An efficient algorithm has been presented which enables all of the low-weight stopping sets of some common LDPC codes to be evaluated. Future research is planned to explore efficient algorithms for use with multiple computers operating in parallel, in order to evaluate all low-weight stopping sets of commonly used LDPC codes several thousand bits long.

#### **13.5 Summary**

It has been shown that the indicative performance of an LDPC code may be determined from exhaustive analysis of the low-weight spectral terms of the code's stopping sets which by definition includes the low-weight codewords. In a breakthrough, Rosnes and Ytrehus demonstrated the feasibility of exhaustive, low-weight stopping set analysis of codes whose parity-check matrix is sparse using a bounded tree search over the length of the code, with no distinction between information and parity bits. For an (*n*, *k*) code, the potential total search space is of size 2*<sup>n</sup>* but a good choice of bound dramatically reduces this search space to a practical size. Indeed, the choice of bound is critical to the success of the algorithm. It has been shown that an improved algorithm can be obtained if the bounded tree search is applied to a set of *k* information bits since the potential total search space is initially reduced to size 2*<sup>k</sup>* . Since such a restriction will only find codewords and not all stopping sets, a class of bits is defined as unsolved parity bits, and these are also searched as appended bits in order to find all low-weight stopping sets. Weight spectrum results have been presented for commonly used WiMax LDPC codes in addition to some other well-known LDPC codes.

An interesting area of future research has been identified whose aim is to improve the performance of the iterative decoder, for a given LDPC code, by determining low-weight codewords of the dual code and using these as additional parity-check equations. The tree search may be used to determine improvements to the code's stopping sets as a result.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 14 Erasures and Error-Correcting Codes**

#### **14.1 Introduction**

It is well known that an (*n*, *k*, *dmin*) error-correcting code *C*, where *n* and *k* denote the code length and information length, can correct *dmin* − 1 erasures [15, 16], where *dmin* is the minimum Hamming distance of the code. However, it is not so well known that the average number of erasures correctable by most codes is significantly higher than this and almost equal to *n* − *k*. In this chapter, an expression is obtained for the probability density function (PDF) of the number of correctable erasures as a function of the weight enumerator function of the linear code. Analysis results are given for several common codes in comparison to maximum likelihood decoding performance for the binary erasure channel. Many codes, including BCH codes, Goppa codes, double-circulant and self-dual codes, have weight distributions that closely match the binomial distribution [13–15, 19]. It is shown for these codes that a lower bound on the average number of correctable erasures is *n* − *k* − 2. The decoder error rate performance for these codes is also analysed. Results are given for rate 0.9 codes, and it is shown for code lengths of 5000 bits or longer that there is an insignificant difference in performance between these codes and the theoretical optimum maximum distance separable (MDS) codes. Results are given for specific codes, including BCH codes, extended quadratic residue codes, LDPC codes designed using the progressive edge growth (PEG) technique [12] and turbo codes [1].

The erasure correcting performance of codes and associated decoders has received renewed interest in the study of network coding as a means of providing efficient computer communication protocols [18]. Furthermore, the erasure performance of LDPC codes, in particular, has been used as a measure of predicting the code performance for the additive white Gaussian noise (AWGN) channel [6, 17]. One of the first analyses of the erasure correction performance of particular linear block codes is provided in a key-note paper by Dumer and Farrell [7] who derive the erasure correcting performance of long binary BCH codes and their dual codes. Dumer and Farrell show that these codes achieve capacity for the erasure channel.

# **14.2 Derivation of the PDF of Correctable Erasures**

## *14.2.1 Background and Definitions*

A set of *s* erasures is a list of erased bit positions defined as *fi* where

$$0 \le i < s, \quad f\_i \in \{0, 1, \dots, n-1\}$$

A codeword **x** = (*x*0, *x*1, ..., *xn*−1) satisfies the parity-check equations of the parity-check matrix **H**

$$\mathbf{H} \mathbf{x}^{\mathsf{T}} = \mathbf{0}$$

A codeword with *s* erasures is defined as

$$\mathbf{x} = (x\_{u\_0}, x\_{u\_1}, \dots, x\_{u\_{n-1-s}} \,|\, x\_{f\_0}, x\_{f\_1}, \dots, x\_{f\_{s-1}})$$

where *xuj* are the unerased coordinates of the codeword, and the set of *s* erased coordinates is defined as **fs**. There are a total of *n* − *k* parity-check equations and, provided the erased bit positions correspond to independent columns of the **H** matrix, each of the erased bits may be solved using a parity-check equation derived by the classic technique of Gaussian reduction [15–17]. For maximum distance separable (MDS) codes [15], any set of *s* erasures is correctable by the code provided that

$$s \le n - k \tag{14.1}$$

Unfortunately, the only binary MDS codes are trivial codes [15].
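The Gaussian-reduction erasure decoder described above can be sketched as follows (an illustrative implementation, not from the text); received bits are 0/1 values and erasures are marked `None`:

```python
def solve_erasures(H, r):
    """Correct erasures by Gaussian reduction over GF(2).  Returns the
    completed vector, or None if the erased columns of H are linearly
    dependent, i.e. the erasure pattern is not correctable."""
    n = len(r)
    erased = [j for j in range(n) if r[j] is None]
    # each row gives: coefficients of the unknowns | syndrome of the knowns
    rows = []
    for row in H:
        coeffs = [row[j] for j in erased]
        rhs = sum(row[j] * r[j] for j in range(n) if r[j] is not None) % 2
        rows.append(coeffs + [rhs])
    piv = 0
    for col in range(len(erased)):           # Gauss-Jordan elimination
        p = next((i for i in range(piv, len(rows)) if rows[i][col]), None)
        if p is None:
            return None                      # dependent column: unsolvable
        rows[piv], rows[p] = rows[p], rows[piv]
        for i in range(len(rows)):
            if i != piv and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[piv])]
        piv += 1
    out = list(r)
    for k, j in enumerate(erased):
        out[j] = rows[k][-1]
    return out

# toy example: (7, 4) Hamming parity-check matrix
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
```

The decoder returns `None` exactly when the erased positions correspond to dependent columns of **H**, which by the correspondence derived in the next section means the erasures contain the support of a codeword.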

# *14.2.2 The Correspondence Between Uncorrectable Erasure Patterns and Low-Weight Codewords*

Provided the code is capable of correcting the set of *s* erasures, then a parity-check equation may be used to solve each erasure, viz:

$$x\_{f\_i} = \sum\_{j=0}^{n-s-1} h\_{i,j}\, x\_{u\_j}, \qquad 0 \le i < s$$

where *hi*,*j* is the coefficient of row *i* and column *j* of the Gaussian-reduced matrix **H**.

As the parity-check equations are Gaussian reduced, no erased bit is a function of any other erased bits. There will also be *n* − *k* − *s* remaining parity-check equations, which do not contain any of the erased bits' coordinates *xfj* :

$$h\_{s,0}\mathbf{x}\_{u\_0} + h\_{s,1}\mathbf{x}\_{u\_1} + h\_{s,2}\mathbf{x}\_{u\_2} + \dots + h\_{s,n-s-1}\mathbf{x}\_{u\_{n-s-1}} = \mathbf{0}$$

$$h\_{s+1,0}\mathbf{x}\_{u\_0} + h\_{s+1,1}\mathbf{x}\_{u\_1} + h\_{s+1,2}\mathbf{x}\_{u\_2} + \dots + h\_{s+1,n-s-1}\mathbf{x}\_{u\_{n-s-1}} = \mathbf{0}$$

$$h\_{s+2,0}\mathbf{x}\_{u\_0} + h\_{s+2,1}\mathbf{x}\_{u\_1} + h\_{s+2,2}\mathbf{x}\_{u\_2} + \dots + h\_{s+2,n-s-1}\mathbf{x}\_{u\_{n-s-1}} = \mathbf{0}$$

$$\dots$$

$$\dots$$

$$h\_{n-k-1,0}\mathbf{x}\_{u\_0} + h\_{n-k-1,1}\mathbf{x}\_{u\_1} + h\_{n-k-1,2}\mathbf{x}\_{u\_2} + \dots + h\_{n-k-1,n-s-1}\mathbf{x}\_{u\_{n-s-1}} = \mathbf{0}$$

Further to this, the hypothetical case is considered where there is an additional erased bit *xfs* . This bit coordinate is clearly one of the previously unerased bit coordinates, denoted as *xup* .

$$x\_{f\_s} = x\_{u\_p}$$

Also, in this case it is considered that these *s*+1 erased coordinates do not correspond to *s* + 1 independent columns of the **H** matrix, but only to *s* + 1 dependent columns. This means that *xup* is not contained in any of the *n* − *k* − *s* remaining parity-check equations, and cannot be solved as the additional erased bit.

For the first *s* erased bits whose coordinates do correspond to *s* independent columns of the **H** matrix, the set of codewords is considered in which all of the unerased coordinates are equal to zero except for *xup* . In this case the parity-check equations above are simplified to become:

$$\begin{aligned} x\_{f\_0} &= h\_{0,p}\, x\_{u\_p} \\ x\_{f\_1} &= h\_{1,p}\, x\_{u\_p} \\ x\_{f\_2} &= h\_{2,p}\, x\_{u\_p} \\ &\;\;\vdots \\ x\_{f\_{s-1}} &= h\_{s-1,p}\, x\_{u\_p} \end{aligned}$$

As there are, by definition, at least *n* − *s* − 1 zero coordinates contained in each codeword, the maximum weight of any of the codewords above is *s* + 1. Furthermore, any erased coordinate that is zero may be considered as an unsolved coordinate, since no non-zero coordinate is a function of this coordinate. This leads to the following theorem.

**Theorem 1** *The non-zero coordinates of a codeword of weight w that is not the juxtaposition of two or more lower weight codewords, provide the coordinate positions of w* − 1 *erasures that can be solved and provide the coordinate positions of w erasures that cannot be solved.*

*Proof* The coordinates of a codeword of weight *w* must satisfy the equations of the parity-check matrix. With the condition that the codeword is not constructed from the juxtaposition of two or more lower weight codewords, the codeword must have *w* − 1 coordinates that correspond to linearly independent columns of the **H** matrix and *w* coordinates that correspond to linearly dependent columns of the **H** matrix.

**Corollary 1** *Given s coordinates corresponding to an erasure pattern containing s erasures, s* ≤ (*n* − *k*)*, of which w coordinates are equal to the non-zero coordinates of a single codeword of weight w, the maximum number of erasures that can be corrected is s* − 1 *and the minimum number that can be corrected is w* − 1*.*

**Corollary 2** *Given w* − 1 *coordinates that correspond to linearly independent columns of the* **H** *matrix and w coordinates that correspond to linearly dependent columns of the* **H** *matrix, a codeword can be derived that has a weight less than or equal to w.*

The weight enumeration function of a code [15] is usually described as a homogeneous polynomial of degree *n* in *x* and *y*.

$$W(x, y) = \sum\_{i=0}^{n} A\_i x^{n-i} y^i$$

The support of a codeword is defined [15] as the coordinates of the codeword that are non-zero. The probability of the successful erasure correction of *s* or more erasures is equal to the probability that no subset of the *s* erasure coordinates corresponds to the support of any codeword.

The number of possible erasure patterns of *s* erasures for a code of length *n* is $\binom{n}{s}$. For a *single* codeword of weight *w*, the number of erasure patterns with *s* coordinates that include the support of this codeword is $\binom{n-w}{s-w}$. Thus, the probability of a subset of the *s* coordinates coinciding with the support of a single codeword of weight *w*, *prob*(**xw** ∈ **fs**), is given by:

$$prob(\mathbf{x\_w} \in \mathbf{f\_s}) = \frac{\binom{n-w}{s-w}}{\binom{n}{s}}$$

and

$$prob(\mathbf{x\_w} \in \mathbf{f\_s}) = \frac{(n-w)!\; s!\; (n-s)!}{n!\; (s-w)!\; (n-s)!}$$

simplifying

$$prob(\mathbf{x\_w} \in \mathbf{f\_s}) = \frac{(n-w)! \text{ s!}}{n! \text{ (s-w)!}}$$

In such an event the *s* erasures are uncorrectable because, for these erasures, there are not *s* independent parity-check equations [15, 16]. However, *s* − 1 erasures are correctable provided the *s* − 1 erasures do not contain the support of a lower weight codeword.
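The covering probability above is easy to check numerically (an illustrative sketch; the function names are not from the text):

```python
from math import comb, factorial

def prob_cover(n, w, s):
    """Probability that s random erasures cover a fixed weight-w support:
    C(n-w, s-w) / C(n, s)."""
    return comb(n - w, s - w) / comb(n, s)

def prob_cover_factorial(n, w, s):
    """The simplified factorial form derived above, for comparison."""
    return factorial(n - w) * factorial(s) / (factorial(n) * factorial(s - w))
```

Both forms agree, e.g. for the (128, 99, 10) code discussed later, `prob_cover(128, 10, 29)` matches `prob_cover_factorial(128, 10, 29)`.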

The probability that *s* erasures will contain the support of at least one codeword of any weight, is upper and lower bounded by

$$1 - \prod\_{j=d\_{\min}}^{s} \left(1 - A\_j \frac{(n-j)!\,s!}{n!\,(s-j)!}\right) < P\_s \le \sum\_{j=d\_{\min}}^{s} A\_j \frac{(n-j)!\,s!}{n!\,(s-j)!} \tag{14.2}$$

And given *s* + 1 erasures, the probability that exactly *s* erasures are correctable, *Pr*(*s*) is given by

$$Pr(\mathbf{s}) = P\_{\mathbf{s}+1} - P\_{\mathbf{s}} \tag{14.3}$$

Given up to *n* − *k* erasures the average number of erasures correctable by the code is

$$\overline{N\_{\epsilon}} = \sum\_{s=d\_{\min}}^{n-k} sPr(s) = \sum\_{s=d\_{\min}}^{n-k} s(P\_{s+1} - P\_s) \,. \tag{14.4}$$

Carrying out the sum in reverse order and noting that *Pn*−*k*+<sup>1</sup> = 1, the equation simplifies to become

$$\overline{N\_{\varepsilon}} = (n - k) - \sum\_{s=d\_{\text{min}}}^{n-k} P\_s \tag{14.5}$$

An MDS code can correct *n* − *k* erasures and is clearly the maximum number of correctable erasures as there are only *n* − *k* independent parity-check equations. It is useful to denote an MDS shortfall

$$\text{MDS}\_{\text{shortfall}} = \sum\_{s=d\_{\text{min}}}^{n-k} P\_s \tag{14.6}$$

and

$$N\_e = (n - k) - \text{MDS}\_{\text{shortfall}} \tag{14.7}$$

with

$$\sum\_{s=d\_{\min}}^{n-k} \left(1 - \prod\_{j=d\_{\min}}^{s} \left(1 - A\_j \frac{(n-j)!\,s!}{n!\,(s-j)!}\right)\right) < \text{MDS}\_{\text{shortfall}} \tag{14.8}$$

and

$$\text{MDS}\_{\text{shortfall}} < \sum\_{s=d\_{\text{min}}}^{n-k} \sum\_{j=d\_{\text{min}}}^{s} A\_j \frac{(n-j)!s!}{n!(s-j)!} \tag{14.9}$$

The contribution made by the high multiplicity of low-weight codewords to the shortfall in MDS performance is indicated by the probability *P*ˆ*<sup>j</sup>* that the support of at least one codeword of weight *j* is contained in *s* erasures averaged over the number of uncorrectable erasures *s*, from *s* = *dmin* to *n* − *k*, and is given by

$$\hat{P}\_j = \sum\_{s=d\_{\min}}^{n-k} Pr(s-1) A\_j \frac{(n-j)!s!}{n!(s-j)!} \tag{14.10}$$
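As a small worked example of Eqs. (14.2)–(14.7), consider the (7, 4, 3) Hamming code, whose weight enumerator (*A*3 = 7, *A*4 = 7, *A*7 = 1) is known exactly. The sketch below takes *P<sub>s</sub>* as the union (upper) bound of Eq. (14.2), so the computed value is an upper bound on the MDS shortfall:

```python
from math import comb

# exact weight enumerator of the (7, 4, 3) Hamming code
A = {3: 7, 4: 7, 7: 1}
n, k, dmin = 7, 4, 3

def P_upper(s):
    """Union bound on the probability that s erasures contain the
    support of at least one codeword (Eq. 14.2, right-hand side)."""
    return sum(Aj * comb(n - j, s - j) / comb(n, s)
               for j, Aj in A.items() if dmin <= j <= s)

shortfall = sum(P_upper(s) for s in range(dmin, n - k + 1))   # Eq. (14.6)
avg_correctable = (n - k) - shortfall                          # Eq. (14.7)
```

This yields a shortfall of 0.2, i.e. an average of 2.8 correctable erasures against the MDS limit of *n* − *k* = 3; for this tiny code the union bound happens to be exact, since three erasure positions can contain at most one weight-3 support.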

#### **14.3 Probability of Decoder Error**

For the erasure channel with erasure probability *p*, the probability of codeword decoder error, *Pd* (*p*) for the code may be derived in terms of the weight spectrum of the code assuming ML decoding. It is assumed that a decoder error is declared if more than *n* − *k* erasures occur and that the decoder does not resort to guessing erasures. The probability of codeword decoder error is given by the familiar function of *p*.

$$P\_d(p) = \sum\_{s=1}^{n} \binom{n}{s} P\_s\, p^s (1-p)^{n-s} \tag{14.11}$$

Splitting the sum into two parts

$$P\_d(p) = \sum\_{s=1}^{n-k} \binom{n}{s} P\_s\, p^s (1-p)^{n-s} + \sum\_{s=n-k+1}^{n} \binom{n}{s} P\_s\, p^s (1-p)^{n-s} \tag{14.12}$$

The second term gives the decoder error rate performance for a hypothetical MDS code and the first term represents the degradation of the code compared to an MDS code. Using the upper bound of Eq. (14.2),

$$P\_d(p) \le \sum\_{s=1}^{n-k} \sum\_{j=1}^{s} A\_j \frac{(n-j)!\,s!}{n!\,(s-j)!} \binom{n}{s} p^s (1-p)^{n-s} + \sum\_{s=n-k+1}^{n} \binom{n}{s} p^s (1-p)^{n-s} \tag{14.13}$$

As well as determining the performance shortfall, compared to MDS codes, in terms of the number of correctable erasures it is also possible to determine the loss from capacity for the erasure channel. The capacity of the erasure channel with erasure probability *p* was originally determined by Elias [9] to be 1 − *p*. Capacity may be approached with zero codeword error for very long codes, even using non-MDS codes such as BCH codes [7]. However, short codes and even MDS codes, will produce a non-zero frame error rate (FER). For (*n*, *k*, *n* − *k* + 1) MDS codes, a codeword decoder error is deemed to occur whenever there are more than *n* − *k* erasures. (It is assumed here that the decoder does not resort to guessing erasures that cannot be solved). This probability, *PMDS*(*p*), is given by

$$P\_{MDS}(p) = 1 - \sum\_{s=0}^{n-k} \binom{n}{s} p^s (1-p)^{n-s} \tag{14.14}$$

The probability of codeword decoder error for the code may be derived from the weight enumerator of the code using Eq. (14.13).

$$P\_{code}(p) = \sum\_{s=d\_{min}}^{n-k} \sum\_{j=d\_{min}}^{s} A\_j \frac{(n-j)!\,s!}{n!\,(s-j)!} \binom{n}{s} p^s (1-p)^{n-s} + \sum\_{s=n-k+1}^{n} \binom{n}{s} p^s (1-p)^{n-s} \tag{14.15}$$

This simplifies to become

$$P\_{code}(p) = \sum\_{s=d\_{min}}^{n-k} \sum\_{j=d\_{min}}^{s} A\_j \frac{(n-j)!}{(s-j)!\,(n-s)!}\, p^s (1-p)^{n-s} + P\_{MDS}(p) \tag{14.16}$$

The first term in the above equation represents the loss from MDS code performance.

# **14.4 Codes Whose Weight Enumerator Coefficients Are Approximately Binomial**

It is well known that the distance distribution for many linear, binary codes including BCH codes, Goppa codes, self-dual codes [13–15, 19] approximates to a binomial distribution. Accordingly,

$$A\_j \approx \frac{n!}{(n-j)!j!\,2^{n-k}}\tag{14.17}$$

For these codes, for which the approximation is true, the shortfall in performance compared to an MDS code, *MDSshortfall*, is obtained by substitution into Eq. (14.9):

$$\text{MDS}\_{shortfall} = \sum\_{s=1}^{n-k} \sum\_{j=1}^{s} \frac{n!}{(n-j)!\,j!\,2^{n-k}} \frac{(n-j)!}{n!} \frac{s!}{(s-j)!} \tag{14.18}$$

which simplifies to

$$\text{MDS}\_{shortfall} = \sum\_{s=1}^{n-k} \frac{2^s - 1}{2^{n-k}} \tag{14.19}$$

which leads to the simple result

$$\text{MDS}\_{shortfall} = 2 - \frac{n-k+2}{2^{n-k}} \approx 2 \tag{14.20}$$

It is apparent that for these codes the MDS shortfall is just 2 bits from correcting all *n* − *k* erasures. It is shown later using the actual weight enumerator functions for codes, where these are known, that this result is slightly pessimistic since in the above analysis there is a non-zero number of codewords with distance less than *dmin*. However, the error attributable to this is quite small. Simulation results for these codes show that the actual MDS shortfall is closer to 1.6 bits due to the assumption that there is never an erasure pattern which has the support of more than one codeword.
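The telescoping sum of Eq. (14.19) is easily confirmed numerically (an illustrative check):

```python
def shortfall_binomial(r):
    """MDS shortfall, in erased bits, of a code with binomial weight
    spectrum and r = n - k redundant bits (the sum of Eq. 14.19)."""
    return sum((2**s - 1) / 2**r for s in range(1, r + 1))
```

For *r* = 10 this gives 1.988..., matching the closed form 2 − (*r* + 2)/2*<sup>r</sup>*, and the value approaches 2 as *r* grows.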

For these codes whose weight enumerator coefficients are approximately binomial, the probability of the code being able to correct exactly *s* erasures, but no more, may also be simplified from (14.2) and (14.3).

$$Pr(s) = \sum\_{j=1}^{s+1} \frac{n!}{(n-j)!\,j!\,2^{n-k}} \frac{(n-j)!}{n!} \frac{(s+1)!}{(s+1-j)!} - \sum\_{j=1}^{s} \frac{n!}{(n-j)!\,j!\,2^{n-k}} \frac{(n-j)!}{n!} \frac{s!}{(s-j)!} \tag{14.21}$$

which simplifies to become

$$Pr(s) = \frac{2^s}{2^{n-k}} \tag{14.22}$$

for *s* < *n* − *k* and for *s* = *n* − *k*

$$Pr(n-k) = 1 - \sum\_{j=1}^{n-k} \frac{n!}{(n-j)!\,j!\,2^{n-k}} \frac{(n-j)!}{n!} \frac{(n-k)!}{(n-k-j)!} \tag{14.23}$$

and


$$Pr(n-k) = \frac{1}{2^{n-k}}\tag{14.24}$$

For codes whose weight enumerator coefficients are approximately binomial, the pdf of correctable erasures is given in Table 14.1.

The probability of codeword decoder error for these codes is given by substitution into (14.15),

$$P\_{code}(p) = \sum\_{s=0}^{n-k} \left(\frac{2^s - 1}{2^{n-k}}\right) \binom{n}{s} p^s (1-p)^{n-s} + P\_{MDS}(p) \tag{14.25}$$

As first shown by Dumer and Farrell [7], as *n* → ∞ these codes achieve the erasure channel capacity. As examples, the probability of codeword decoder error for hypothetical rate 0.9 codes having binomial weight distributions and lengths 100 to 10,000 bits is shown plotted in Fig. 14.1 as a function of the channel erasure probability, expressed in terms of the relative erasure channel capacity 0.9/(1 − *p*). It can be seen that at a decoder error rate of 10−<sup>8</sup> the (1000, 900) code is operating at 95% of channel capacity, and the (10,000, 9,000) code is operating at 98% of channel capacity. A comparison with MDS codes is shown in Fig. 14.2 for codelengths from 500 to 50,000 bits. It can be seen that for codelengths of 5,000 bits and above, these rate 0.9 codes are effectively optimum, since their performance is indistinguishable from that of MDS codes with the same length and rate.
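The curves of Fig. 14.1 can be reproduced from Eqs. (14.14) and (14.25) with a few lines of Python (an illustrative sketch; Python's exact big-integer binomials keep the computation stable):

```python
from math import comb

def fer_binomial(n, k, p):
    """Frame error rate on the erasure channel, Eq. (14.25), for a
    hypothetical code with binomial weight spectrum: the MDS term of
    Eq. (14.14) plus the binomial-spectrum loss term."""
    # more than n - k erasures always defeat the decoder (P_s = 1)
    mds = sum(comb(n, s) * p**s * (1 - p)**(n - s)
              for s in range(n - k + 1, n + 1))
    # up to n - k erasures fail with probability (2^s - 1) / 2^(n-k)
    loss = sum((2**s - 1) / 2**(n - k) * comb(n, s)
               * p**s * (1 - p)**(n - s)
               for s in range(0, n - k + 1))
    return mds + loss
```

For the (1000, 900) code, evaluating `fer_binomial` over a grid of erasure probabilities *p* traces out the corresponding curve of Fig. 14.1; the FER rises steeply as *p* approaches the capacity limit of 0.1.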

A comparison of MDS codes to codes with binomial weight enumerator coefficients is shown in Fig. 14.3 for rate 1/2 codes with code lengths from 128 to 1024.

**Fig. 14.1** FER performance of codes with binomial weight enumerator coefficients

**Fig. 14.2** Comparison of codes with binomial weight enumerator coefficients to MDS codes

**Fig. 14.3** Comparison of half rate codes having binomial weight enumerator coefficients with MDS codes as a function of erasure probability

# **14.5 MDS Shortfall for Examples of Algebraic, LDPC and Turbo Codes**

The first example is the extended BCH code (128, 99, 10) whose coefficients up to weight 30 of the weight enumerator polynomial [5] are tabulated in Table 14.2.


The PDF of the number of erased bits that are correctable, up to the maximum of 29 erasures, derived from Eqs. (14.2) and (14.3), is shown plotted in Fig. 14.4. Also shown plotted in Fig. 14.4 is the performance obtained numerically. It is straightforward, by computer simulation, to evaluate the erasure correcting performance of the code by generating a pattern of erasures randomly and solving these in turn using the parity-check equations. This procedure corresponds to maximum likelihood (ML) decoding [6, 17]. Moreover, the codeword responsible for any instance of non-MDS performance (due to a particular erasure pattern) can be determined by back substitution into the solved parity-check equations. Except for short codes or very high rate codes, it is not possible to complete this procedure exhaustively, because there are too many combinations of erasure patterns. For example, there are $\binom{128}{29} = 4.67 \times 10^{28}$ patterns of 29 erasures in this code of length 128 bits. In contrast, there are relatively few low-weight codewords responsible for the non-MDS performance of the code. For example, each codeword of weight 10 is responsible for $\binom{118}{19} = 4.13 \times 10^{21}$ erasure patterns not being solvable.
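The simulation procedure just described can be sketched for a toy code as follows (illustrative only; a random erasure pattern is ML-correctable exactly when its columns of **H** are linearly independent, tested here with a GF(2) bitmask basis):

```python
import random

def correctable(H, positions):
    """True iff the erased columns of H are linearly independent over GF(2),
    i.e. the erasure pattern is solvable by Gaussian reduction."""
    basis = {}                           # top-bit index -> basis vector
    for j in positions:
        v = 0
        for i, row in enumerate(H):
            v |= row[j] << i             # column j of H as a bitmask
        while v:
            t = v.bit_length() - 1
            if t not in basis:
                basis[t] = v
                break
            v ^= basis[t]
        else:
            return False                 # column depends on earlier ones
    return True

def sim_fraction(H, n, s, trials, seed=1):
    """Monte Carlo estimate of the fraction of s-erasure patterns that
    are ML-correctable."""
    rng = random.Random(seed)
    good = sum(correctable(H, rng.sample(range(n), s)) for _ in range(trials))
    return good / trials

# toy example: (7, 4) Hamming parity-check matrix
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
```

For the (7, 4) Hamming code, exhaustive enumeration shows that 7 of the 35 patterns of 3 erasures are uncorrectable (exactly the weight-3 codeword supports), so the Monte Carlo estimate converges to 28/35 = 0.8. Note that the stopping set {2, 4, 6} *is* ML-correctable: its columns are independent, illustrating the gap between iterative and ML erasure decoding.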

As the *dmin* of this code is 10, the code is guaranteed to correct any erasure pattern containing up to 9 erasures. It can be seen from Fig. 14.4 that the probability of not being able to correct any pattern of 10 erasures is less than 10−8. The probability of correcting 29 erasures, the maximum number, is 0.29. The average number of erasures corrected is 27.44, almost three times the *dmin*, and the average shortfall from MDS performance is 1.56 erased bits. The prediction of performance by the lower bound is pessimistic, due to double counting of codewords in erasure patterns featuring more than 25 bits or so. The effect of this is evident in Fig. 14.4. The lower bound on the average number of erasures corrected is 27.07, and the shortfall from MDS performance is 1.93 erasures, an error of 0.37 erasures. The erasure performance evaluation by simulation is complementary to the analysis using the weight distribution of the code, in that the simulation, being a sampling procedure, is inaccurate for short, uncorrectable erasure patterns, because few codewords are responsible for the performance in this region. For short, uncorrectable erasure patterns, the lower bound analysis is tight, because it is not possible for these erasure patterns to contain the supports of more than one codeword, since codewords differ in at least *dmin* positions.

The distribution of the codeword weights responsible for non-MDS performance of this code is shown in Fig. 14.5.

This is in contrast to the distribution of low-weight codewords shown in Fig. 14.6. Although there are a larger number of higher weight codewords, there is less chance of an erasure pattern containing a higher weight codeword. The maximum occurrence is for weight 14 codewords as shown in Fig. 14.5.

The FER performance of the (128, 99, 10) extended BCH code is shown plotted in Fig. 14.7 as a function of relative capacity, defined as (*k*/*n*)/(1 − *p*). Also plotted in Fig. 14.7 is the FER performance of a hypothetical (128, 99, 30) MDS code. Equations (14.15) and (14.14), respectively, were used to derive Fig. 14.7. As may be seen from Fig. 14.7, there is a significant shortfall in capacity even for the optimum MDS code. This shortfall is attributable to the relatively short length of the code. At 10−<sup>9</sup> FER, the BCH (128, 99, 10) code achieves approximately 80% of the erasure channel capacity.

**Fig. 14.4** Erasure performance for the (128, 99, 10) Extended BCH Code

**Fig. 14.5** Distribution of codeword weights responsible for non-MDS performance, of the (128, 99, 10) BCH Code

**Fig. 14.6** Distribution of low-weight codewords for the (128, 99, 10) BCH code

**Fig. 14.7** FER performance for the (128, 99, 10) BCH code for the erasure channel


**Table 14.3** Spectral terms up to weight 50 for the extended BCH (256, 207) code

The maximum capacity achievable by any (128, 99) binary code as represented by a (128, 99, 30) MDS code is approximately 82.5%.

An example of a longer code is the (256, 207, 14) extended BCH code. The coefficients up to weight 50 of the weight enumerator polynomial [10] are tabulated in Table 14.3. The evaluated erasure correcting performance of this code is shown in Fig. 14.8, and the code is able to correct up to 49 erasures. It can be seen from Fig. 14.8 that there is a close match between the lower bound analysis and the simulation results for the number of erasures between 34 and 46. Beyond 46 erasures, the lower bound becomes increasingly pessimistic due to double counting of codewords. Below 34 erasures the simulation results are erratic due to insufficient samples. It can be seen from Fig. 14.8 that the probability of correcting only 14 erasures is less than 10−<sup>13</sup> (actually 5.4 × 10−14) even though the *dmin* of the code is 14. If a significant level of erasure correcting failures is defined as 10−6, then from Fig. 14.8, this code is capable of correcting up to 30 erasures even though the guaranteed number of correctable erasures is only 13. The average number of erasures correctable by the code is 47.4, an average shortfall of 1.6 erased bits. The distribution of codeword weights responsible for the non-MDS performance of this code is shown in Fig. 14.9.

The FER performance of the BCH (256, 207, 14) code is shown plotted in Fig. 14.10 as a function of relative capacity, defined as *k*/(*n*(1 − *p*)). Also plotted in

**Fig. 14.8** PDF of erasure corrections for the (256, 207, 14) Extended BCH Code

**Fig. 14.9** Distribution of codeword weights responsible for non-MDS performance, for the extended (256, 207, 14) BCH Code

**Fig. 14.10** FER performance for the (256, 207, 14) BCH Code for the erasure channel

Fig. 14.10 is the FER performance of a hypothetical (256, 207, 50) MDS code. Equations (14.15) and (14.14), respectively, were used to derive Fig. 14.10. As may be seen from Fig. 14.10, there is less of a shortfall in capacity compared to the BCH (128, 99, 10) code. At 10<sup>−9</sup> FER, the BCH (256, 207, 14) code achieves approximately 85.5% of the erasure channel capacity. The maximum capacity achievable by any (256, 207) binary code, as represented by the (256, 207, 50) hypothetical MDS code, is approximately 87%.

The next code to be investigated is the (512, 457, 14) extended BCH code, which was chosen because it is comparable to the (256, 207, 14) code in being able to correct a similar maximum number of erasures (55 *cf.* 49) and has the same *dmin* of 14. Unfortunately, the weight enumerator polynomial of this code has yet to be determined, and only erasure simulation results may be obtained. Figure 14.11 shows the performance of this code. The average number of erasures corrected is 53.4, an average shortfall of 1.6 erased bits. The average shortfall is identical to that of the (256, 207, 14) extended BCH code. The probability of achieving MDS code performance, i.e. being able to correct all *n* − *k* erasures, is also the same, equal to 0.29. The distribution of codeword weights responsible for non-MDS performance of the (512, 457, 14) code is very similar to that for the (256, 207, 14) code, as shown in Fig. 14.12.

An example of an extended cyclic quadratic residue code is the (168, 84, 24) code whose coefficients of the weight enumerator polynomial have been recently determined [20] and are tabulated up to weight 72 in Table 14.4. This code is a self-dual, doubly even code, but not extremal because its *dmin* is not 32 but 24 [3]. The FER performance of the (168, 84, 24) code is shown plotted in Fig. 14.13 as

**Fig. 14.11** PDF of erasure corrections for the (512, 457, 14) Extended BCH Code

**Fig. 14.12** Distribution of codeword weights responsible for non-MDS performance, for the extended (512, 457, 14) BCH Code


**Fig. 14.13** FER performance for the (168, 84, 24) eQR Code for the erasure channel

a function of relative capacity, defined as *k*/(*n*(1 − *p*)). Also plotted in Fig. 14.13 is the FER performance of a hypothetical (168, 84, 85) MDS code. Equations (14.15) and (14.14), respectively, were used to derive Fig. 14.13. The performance of the (168, 84, 24) code is close to that of the hypothetical MDS code, but both codes are around 30% from capacity at 10<sup>−6</sup> FER.

The erasure correcting performance of non-algebraically designed codes is quite different from that of algebraically designed codes, as may be seen from the performance results for a (240, 120, 16) turbo code shown in Fig. 14.14. The turbo code features memory 4 constituent recursive encoders and a code-matched, modified S interleaver, in order to maximise the *dmin* of the code. The average number of erasures correctable by the code is 116.5 and the average shortfall is 3.5 erased bits. The distribution of codeword weights responsible for non-MDS performance of the (240, 120, 16) code is very different from that of the algebraic codes and features a flat distribution, as shown in Fig. 14.15.

Similarly, the erasure correcting performance of a (200, 100, 10) LDPC code designed using the Progressive Edge Growth (PEG) algorithm [12] is again quite different from the algebraic codes, as shown in Fig. 14.16. As is typical of randomly generated LDPC codes, the *dmin* of the code is quite small at 10, even though the code has been optimised. For this code, the average number of correctable erasures is 93.19 and the average shortfall is 6.81 erased bits. This is markedly worse than the turbo code performance. It is the preponderance of low-weight codewords that is responsible for the inferior performance of this code compared to the other codes, as shown by the codeword weight distribution in Fig. 14.17.

The relative weakness of the LDPC code and turbo code becomes clear when they are compared to a good algebraic code with similar parameters, the (200, 100, 32) extended quadratic residue code. The p.d.f. of the number of erasures corrected by this code is shown in Fig. 14.18. The difference between having a *dmin* of 32, compared to 16 for the turbo code and 10 for the LDPC code, is dramatic. The average number of correctable erasures is 98.4 and the average shortfall is 1.6 erased bits. The weight enumerator polynomial of this self-dual code is currently unknown, as evaluation of the 2<sup>100</sup> codewords is beyond the reach of today's computers. However, the distribution of codeword weights responsible for non-MDS performance of the (200, 100, 32) code, shown in Fig. 14.19, reflects the doubly even codewords of this code and its *dmin* of 32.

# *14.5.1 Turbo Codes with Dithered Relative Prime (DRP) Interleavers*

DRP interleavers were introduced in [4]. They have been shown to produce some of the largest minimum distances for turbo codes. However, the iterative decoding algorithm does not exploit this performance to the full on AWGN channels, where the performance of these interleavers is similar to that of randomly designed interleavers having lower minimum distance. This is due to convergence problems in the error floor region. A DRP interleaver is a concatenation of three interleavers: the two dithers *A* and *B*, and a relative prime interleaver π:

$$I(i) = B(\pi(A(i)))\tag{14.26}$$

**Fig. 14.14** PDF of erasure corrections for the (240, 120, 16) turbo code

**Fig. 14.15** Distribution of codeword weights responsible for non-MDS performance, for the (240, 120, 16) turbo code

**Fig. 14.16** PDF of erasure corrections for the (200, 100, 10) PEG LDPC code

**Fig. 14.17** Distribution of codeword weights responsible for non-MDS performance, for the (200, 100, 10) PEG LDPC code

**Fig. 14.18** PDF of erasure corrections for the (200, 100, 32) Extended QR Code

**Fig. 14.19** Distribution of codeword weights responsible for non-MDS performance, for the (200, 100, 32) Extended QR Code


The dithers are short permutations, generally of length *m* = 4, 8, 16 depending on the length of the overall interleaver. We have

$$A(i) = m\left\lfloor i/m \right\rfloor + a\_{i\%m} \tag{14.27}$$

$$B(i) = m\left\lfloor i/m \right\rfloor + b\_{i\%m} \tag{14.28}$$

$$
\pi(i) = (p \, i + q)\,\%\, k,\tag{14.29}
$$

where *a* and *b* are permutations of length *m*, and *p* must be relatively prime to *k*. If *a*, *b* and *p* are properly chosen, the minimum distance of turbo codes can be drastically improved compared to that of a turbo code using a typical *S*-random interleaver. A comparison is shown in Table 14.5 for memory 3 component codes.
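The construction of Eqs. (14.26)–(14.29) can be sketched directly, with π taken modulo the interleaver length *k*. The parameters below (*k* = 16, *m* = 4, the dithers *a* and *b*, and *p* = 5, *q* = 3) are hypothetical toy values, not taken from Table 14.5:

```python
from math import gcd

def drp_interleaver(k, m, a, b, p, q):
    """Dithered relative prime interleaver I(i) = B(pi(A(i))) of
    Eqs. (14.26)-(14.29): A and B apply the length-m dither permutations
    a and b within each block of m positions, and pi(i) = (p*i + q) mod k
    with p relatively prime to k."""
    assert k % m == 0 and gcd(p, k) == 1
    A = lambda i: m * (i // m) + a[i % m]
    B = lambda i: m * (i // m) + b[i % m]
    pi = lambda i: (p * i + q) % k
    return [B(pi(A(i))) for i in range(k)]

# Hypothetical toy parameters: a composition of three permutations is
# itself a permutation of 0..k-1.
I = drp_interleaver(16, 4, a=[2, 0, 3, 1], b=[1, 3, 0, 2], p=5, q=3)
```

Because each of *A*, π and *B* is a permutation, *I*(*i*) is guaranteed to be a permutation of 0, …, *k* − 1 for any valid choice of dithers and of *p*, *q*.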

As an example, two turbo codes are considered: one employing a DRP interleaver, with parameters (120, 40, 19), and another employing a typical *S*-random interleaver, with parameters (120, 40, 13).

#### *14.5.2 Effects of Weight Spectral Components*

The weight spectrum of each of the two turbo codes has been determined exhaustively from the G matrix of each code by codeword enumeration using the revolving door algorithm. The weight spectrum of both turbo codes is shown in Table 14.6. It should be noted that, as the codes include the all-ones codeword, *A*<sub>*n*−*j*</sub> = *A*<sub>*j*</sub>, and so only weights up to *A*<sub>60</sub> are shown in Table 14.6.
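Exhaustive enumeration of this kind can be sketched with a Gray-code walk over the 2<sup>*k*</sup> messages, a simple alternative to the revolving door ordering used by the authors: successive codewords differ in exactly one row of G, so each codeword costs a single XOR. The (7, 4) Hamming generator below is purely illustrative; the turbo code G matrices are not reproduced here:

```python
def weight_spectrum(G_rows):
    """Exhaustively enumerate all 2^k codewords spanned by the rows of G
    (given as n-bit integers), walking the messages in Gray-code order so
    that each step XORs in a single row. Returns {weight: multiplicity}."""
    counts = {0: 1}
    cw = 0
    for i in range(1, 1 << len(G_rows)):
        cw ^= G_rows[(i & -i).bit_length() - 1]  # index of the flipped bit
        w = bin(cw).count("1")
        counts[w] = counts.get(w, 0) + 1
    return counts

# Systematic (7, 4) Hamming generator rows packed as 7-bit integers;
# its spectrum is A_0 = 1, A_3 = 7, A_4 = 7, A_7 = 1.
hamming_G = [0b1100001, 0b1010010, 0b0110100, 0b1111000]
```

At one XOR and one popcount per codeword this is feasible for *k* = 40 (2<sup>40</sup> steps) but clearly not for codes with *k* = 100 or more, which is why longer codes need the probabilistic methods of Sect. 14.6.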

Using the weight spectrum of each code the upper and lower bound cumulative distributions and corresponding density functions have been derived using Eqs. (14.2) and (14.3), respectively, and are compared in Fig. 14.20. It can be observed that the DRP interleaver produces a code with a significantly smaller probability of failing to correct a given number of erasures.

The MDS shortfalls for the two codes are:

$$\text{MDS}\_{\text{shortfall}}(120, 40, 19) = 2.95 \text{ bits} \tag{14.30}$$

$$\text{MDS}\_{\text{shortfall}}(120, 40, 13) = 3.29 \text{ bits} \tag{14.31}$$

The distribution of the codeword weights responsible for the MDS shortfalls is shown in Fig. 14.21. For interest, also shown in Fig. 14.21 is the distribution for


**Table 14.6** Weight spectrum of the (120, 40, 19) and (120, 40, 13) turbo codes. Multiplicities for weights larger than 60 satisfy *A*<sub>60−*i*</sub> = *A*<sub>60+*i*</sub>


**Table 14.6** (continued)

**Fig. 14.20** Probability of Maximum Likelihood decoder failure

the (120, 40, 28) best known linear code. This code, which is chosen to have the same block length and code rate as the turbo codes, is derived by shortening a (130, 50, 28) code obtained by adding two parity checks to the (128, 50, 28) extended BCH code. This linear code has an MDS shortfall of 1.62 bits and its weight spectrum consists of doubly even codewords, as shown in Table 14.7. For the turbo codes, the contribution made by the lower weight codewords is apparent in Fig. 14.21, and this is confirmed by the plot of the cumulative contribution made by the lower weight codewords shown in Fig. 14.22.


**Fig. 14.21** Distribution of codeword weights responsible for non-MDS performance

**Fig. 14.22** Cumulative code weight contribution to MDS shortfall

**Fig. 14.23** Probability of ML decoder error for the erasure channel

For the erasure channel, the performance of the two turbo codes and the (120, 40, 28) code is given by (14.15) and is shown in Fig. 14.23, assuming ML decoding. Also shown in Fig. 14.23 is the performance of a hypothetical binary (120, 40, 81) MDS code, which is given by the second term of (14.15). The code derived from the shortened, extended BCH code, the (120, 40, 28) code, has the best performance and compares well to the lower bound provided by the hypothetical MDS code. The DRP interleaver turbo code also has good performance, but the *S*-random interleaver turbo code shows an error floor due to its *dmin* of 13.
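The hypothetical MDS performance is simple to evaluate: an (*n*, *k*) MDS code fails on the erasure channel only when more than *n* − *k* bits are erased, so its FER is a binomial tail probability. A sketch consistent with this description (the full Eq. (14.15) itself is not reproduced in this extract):

```python
from math import comb

def mds_erasure_fer(n, k, p):
    """FER of a hypothetical (n, k) MDS code on the erasure channel with
    erasure probability p: decoding fails only when the number of erased
    bits exceeds n - k, giving a binomial tail probability."""
    return sum(comb(n, e) * p**e * (1 - p)**(n - e)
               for e in range(n - k + 1, n + 1))
```

For the (120, 40, 81) MDS code above, failure requires 81 or more of the 120 transmitted bits to be erased.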

# **14.6 Determination of the** *dmin* **of Any Linear Code**

It is well known that the determination of the weights of any linear code is a Nondeterministic Polynomial time (NP) hard problem [8] and, except for short codes, the best methods to date for determining the minimum Hamming distance, *dmin*, of a linear code are probabilistically based [2]. Most methods are based on the generator matrix of the code, the G matrix, and tend to be biased towards searching using constrained information weight codewords. Such methods become less effective for long codes or codes with code rates around 1/2, because the weights of the evaluated codewords tend to be binomially distributed with average weight *n*/2 [15].

Corollary 2 from Sect. 14.2 above provides the basis of a probabilistic method to find low-weight codewords in a significantly smaller search space than the G matrix methods. Given an uncorrectable erasure pattern of *n* − *k* erasures, from Corollary 2, the codeword weight is less than or equal to *n* − *k*. The search method suggested by this becomes one of randomly generating erasure patterns of *n* − *k* + 1 erasures, which of course are uncorrectable by any (n, k) code, and determining the codeword and its weight from (14.2). This time, the weights of the evaluated codewords will tend to be binomially distributed with average weight (*n* − *k* + 1)/2. With this trend, for *N*<sub>trials</sub> random erasure patterns, the number of codewords determined with weight *d*, *M<sub>d</sub>*, is given by

$$M\_d = N\_{\text{trials}} \frac{(n-k+1)!}{d!(n-k-d+1)!2^{n-k+1}} \tag{14.32}$$

As an example of this approach, the self-dual, bordered, double-circulant (168, 84) code based on the prime number 83 is described in [11] as having an unconfirmed *dmin* of 28. From (14.32), when using 18,000 trials, 10 codewords of weight 28 will be found on average. However, as the code is doubly even and only has codeword weights which are a multiple of 4, using 18,000 trials, around 40 codewords of weight 28 are expected. In a set of trials using this method for the (168, 84) code, 61 codewords of weight 28 were found with 18,000 trials. Furthermore, 87 codewords of weight 24 were also found, indicating that the *dmin* of this code is 24 and not 28 as was originally conjectured in [11].
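A minimal sketch of this search: any pattern of *n* − *k* + 1 erasures is uncorrectable and must therefore cover the support of a codeword, found here as a GF(2) linear dependence among the corresponding columns of **H**. The (7, 4) Hamming parity-check matrix stands in for the (168, 84) code purely for illustration:

```python
import random

def gf2_column_dependence(cols):
    """Given matrix columns packed as row-bitmask integers, return a
    bitmask over the columns describing a nonzero GF(2) linear
    dependence among them (0 if the columns are independent)."""
    basis = {}  # lowest set bit -> (reduced column, combination mask)
    for j, v in enumerate(cols):
        comb = 1 << j
        while v:
            low = v & -v
            if low in basis:
                bv, bc = basis[low]
                v, comb = v ^ bv, comb ^ bc
            else:
                basis[low] = (v, comb)
                break
        if v == 0:
            return comb
    return 0

def random_covered_codeword(H, n, k, rng):
    """Draw a random pattern of n-k+1 erasures (uncorrectable by any
    (n, k) code) and return the support of a codeword of weight
    <= n-k+1 that the pattern must cover."""
    erased = rng.sample(range(n), n - k + 1)
    cols = [sum(H[r][j] << r for r in range(len(H))) for j in erased]
    dep = gf2_column_dependence(cols)
    return sorted(erased[i] for i in range(len(erased)) if (dep >> i) & 1)

def search_dmin(H, n, k, trials, seed=0):
    """Smallest codeword weight found over the given number of trials."""
    rng = random.Random(seed)
    return min(len(random_covered_codeword(H, n, k, rng))
               for _ in range(trials))

# (7, 4) Hamming parity-check matrix, used here as a small illustration.
hamming_H = [[1, 0, 1, 0, 1, 0, 1],
             [0, 1, 1, 0, 0, 1, 1],
             [0, 0, 0, 1, 1, 1, 1]]
```

For the Hamming code a few hundred trials find a weight-3 codeword; the same procedure applies unchanged to the (168, 84) code given its parity-check matrix.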

The search method can be improved by biasing towards the evaluation of erasure patterns that contain small numbers of erasures that cannot be solved. Recalling the analysis in Sect. 14.2, as the parity-check equations are Gaussian reduced, no erased bit is a function of any other erased bits. There will be *n* − *k* − *s* remaining parity-check equations which do not contain the erased bit coordinates **x<sub>f</sub>**. These remaining equations may be searched to see if there is an unerased bit coordinate that is not present in any of the equations. If there is one such coordinate, then this coordinate, in conjunction with the erased coordinates solved so far, forms an uncorrectable erasure pattern involving only *s* erasures instead of *n* − *k* + 1 erasures. With this procedure, biased towards small numbers of unsolvable erasures, it was found that, for the above code, 21 distinct codewords of weight 24 and 17 distinct codewords of weight 28 were determined in 1000 trials, and the search took approximately 2 s on a typical 2.8 GHz personal computer (PC).

In another example, the (216, 108) self-dual, bordered double-circulant code is given in [11] with an unconfirmed *dmin* of 36. With 1000 trials, which took 7 s on the PC, 11 distinct codewords of weight 24 were found, and a longer evaluation confirmed that the *dmin* of this code is indeed 24.

#### **14.7 Summary**

Analysis of the erasure correcting performance of linear, binary codes has provided the surprising result that many codes can correct, on average, almost *n* − *k* erasures and have a performance close to the optimum performance as represented by (hypothetical) binary MDS codes. It was shown that codes having a weight distribution approximating a binomial distribution, which includes many common codes such as BCH codes, Goppa codes and self-dual codes, can correct at least *n* − *k* − 2 erasures on average and closely match the FER performance of MDS codes as code lengths increase. The asymptotic performance achieves capacity for the erasure channel. It was also shown that codes designed for iterative decoders, the turbo and LDPC codes, are relatively weak codes for the erasure channel and compare poorly with algebraically designed codes. Turbo codes designed for optimised *dmin* were found to outperform LDPC codes.

For turbo codes using DRP interleavers on the erasure channel with ML decoding, the result is that these relatively short turbo codes are, on average, only about 3 erasures away from optimal MDS performance. The decoder error rate performance of the two turbo codes when using ML decoding on the erasure channel was compared to that of the (120, 40, 28) best known linear code and a hypothetical binary MDS code. The DRP interleaver demonstrated a clear advantage over the *S*-random interleaver and was not too far away from MDS performance. Analysis of the performance of longer turbo codes is rather problematic, since exhaustive determination of their weight spectra quickly becomes infeasible.

Determination of the erasure correcting performance of a code provides a means of determining the *dmin* of the code, and an efficient search method was described. Using the method, the *dmin* values of two self-dual codes, previously unknown, were determined, and these codes were found to be (168, 84, 24) and (216, 108, 24) codes.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 15 The Modified Dorsch Decoder**

#### **15.1 Introduction**

In a relatively unknown paper published in 1974, Dorsch [4] described a decoder for linear binary block (*n*, *k*) codes using soft decisions quantised to *J* levels. The decoder is applicable to any linear block code and does not rely upon any particular features of the code, such as being a concatenated code or having a sparse parity-check matrix. In the Dorsch decoder, hard decisions are derived from the soft decisions using standard bit by bit detection, choosing the binary state closest to the received coordinate. The hard decisions are then ranked in terms of their likelihoods and candidate codewords are derived from a set of *k*, independent, most likely bits. This is done by producing a new parity-check matrix **HI** obtained by reordering the columns of the original **H** matrix according to the likelihood of each coordinate, and reducing the resulting matrix to echelon canonical form by elementary row operations. After evaluation of several candidate codewords, the codeword with the minimum soft decision metric is output from the decoder. A decoder using a similar principle, but without soft decision quantisation, has been described by Fossorier [5, 6]. Other approaches, after ranking the reliability of the received bits, adopt various search strategies for finding likely codewords [11] or utilise a hard decision decoder in conjunction with a search for errors in the least likely bit positions [2, 15].

The power of the Dorsch decoder arises from the relatively unknown property that most codes, *on average*, can correct almost *n* − *k* erasures [17], which is considerably more than the guaranteed number of correctable erasures of *dmin* − 1, or the guaranteed number of correctable hard decision errors of ⌊(*dmin* − 1)/2⌋, where *dmin* is the minimum Hamming distance of the code. In its operation, the Dorsch decoder needs to correct any combination of *n* − *k* erasures, which is impossible unless the code is an MDS code [12]. Dorsch did not discuss this problem, or potential solutions, in his original paper [4], although at least one solution is implied by the results he presented.

In this chapter, a solution to the erasure correcting problem of being able to solve *n* − *k* erasures for a non-MDS code is described. It is based on using alternative columns of the parity-check matrix without the need for column permutations. It is also shown that it is not necessary to keep recalculating each candidate codeword and its associated soft decision metric in order to find the most likely codeword. Instead, an incremental correlation approach is adopted which features low information weight codewords and a correlation function involving only a small number of coordinates of the received vector [17]. It is proven that maximum likelihood decoding is realised provided all codewords are evaluated up to a bounded information weight. This means that maximum likelihood decoding may be achieved for a high percentage of received vectors. The decoder lends itself to a low complexity, parallel implementation involving a concatenation of hard and soft decision decoding. It produces near maximum likelihood decoding for codes that can be as long as 1000 bits, provided the code rate is high enough. When implementing the decoder, it is shown that complexity may be traded-off against performance in a flexible manner. Decoding results, achieved by the decoder, are presented for some of the most powerful binary codes known and compared to Shannon's sphere packing bound [14].

The extension to non-binary codes is straightforward and this is described in Sect. 15.5.

#### **15.2 The Incremental Correlation Dorsch Decoder**

Codewords with binary coordinates having state 0 or 1, are denoted as:

$$\mathbf{x} = (x\_0, x\_1, x\_2, \dots, x\_{n-1})$$

For transmission, bipolar transmission is used with coordinates having binary state 0 mapped to +1 and having state 1 mapped to −1. Transmitted codewords are denoted as

$$\mathbf{c} = (c\_0, c\_1, c\_2, \dots, c\_{n-1})$$

The received vector **r** consists of *n* coordinates (*r*<sub>0</sub>, *r*<sub>1</sub>, *r*<sub>2</sub>, ..., *r*<sub>*n*−1</sub>) equal to the transmitted codeword plus Additive White Gaussian Noise with variance σ<sup>2</sup>. The received vector processed by the decoder is assumed to have been matched filtered and to be free from distortion, so that 1/σ<sup>2</sup> = 2*E<sub>b</sub>*/*N<sub>o</sub>*, where *E<sub>b</sub>* is the energy per information bit and *N<sub>o</sub>* is the single-sided noise power spectral density. Accordingly,

$$
\sigma^2 = \frac{N\_o}{2E\_b}
$$

The basic principle that is used is that the *k* most reliable bits of the received vector are initially taken as correct and the *n* − *k* least reliable bits are treated as erasures. The parity-check equations of the code, as represented by **H**, are used to solve for these erased bits and a codeword **x**ˆ is obtained. This codeword is either equal to the transmitted codeword or needs only small changes to produce a codeword equal to the transmitted codeword. One difficulty is that, depending on the code, the *n* − *k* least reliable bits usually cannot all be solved as erasures. This depends on the positions of the erased coordinates and the power of the code. Only Maximum Distance Separable (MDS) codes [12] are capable of solving *n* − *k* erasures regardless of the positions of the erasures in the received codeword. Unfortunately, there are no binary MDS codes apart from trivial examples. However, a set of *n* − *k* erasures can always be solved from the *n* − *k* + *s* least reliable bit positions, and, depending on the code, *s* is usually a small integer. In order to obtain the best performance it is important that the very least reliable bit positions are solved first, since the corollary of the fact that the *n* − *k* least reliable bits usually cannot all be solved as erasures is that the *k* most reliable bits, used to derive codeword **x**ˆ, must include a small number of less reliable bits. However, for most received vectors, the difference in reliability between ranked bit *k* and ranked bit *k* + *s* is usually small. For any received coordinate, the a priori log likelihood ratio of the bit being correct is proportional to |*r<sub>i</sub>*|.
The received vector **r** with coordinates ranked in order of most likely to be correct is defined as (*r*<sub>μ<sub>0</sub></sub>, *r*<sub>μ<sub>1</sub></sub>, *r*<sub>μ<sub>2</sub></sub>, ..., *r*<sub>μ<sub>*n*−1</sub></sub>), where |*r*<sub>μ<sub>0</sub></sub>| > |*r*<sub>μ<sub>1</sub></sub>| > |*r*<sub>μ<sub>2</sub></sub>| > ··· > |*r*<sub>μ<sub>*n*−1</sub></sub>|.

The decoder is most straightforward for a binary MDS code. The codeword coordinates (*x*<sub>μ<sub>0</sub></sub>, *x*<sub>μ<sub>1</sub></sub>, ..., *x*<sub>μ<sub>*k*−1</sub></sub>) are formed directly from the received vector **r** using the bitwise decision rule: *x*<sub>μ<sub>*i*</sub></sub> = 1 if *r*<sub>μ<sub>*i*</sub></sub> < 0, else *x*<sub>μ<sub>*i*</sub></sub> = 0. The *n* − *k* coordinates (*x*<sub>μ<sub>*k*</sub></sub>, *x*<sub>μ<sub>*k*+1</sub></sub>, ..., *x*<sub>μ<sub>*n*−1</sub></sub>) are considered to be erased and are derived from the *k* most reliable codeword coordinates (*x*<sub>μ<sub>0</sub></sub>, *x*<sub>μ<sub>1</sub></sub>, ..., *x*<sub>μ<sub>*k*−1</sub></sub>) using the parity-check equations.

For a non-MDS code, the *n* − *k* coordinates cannot always be solved from the parity-check equations because the parity-check matrix is not a Cauchy or Vandermonde matrix [12]. To get around this problem a slightly different order is defined: (*x*<sub>η<sub>0</sub></sub>, *x*<sub>η<sub>1</sub></sub>, *x*<sub>η<sub>2</sub></sub>, ..., *x*<sub>η<sub>*n*−1</sub></sub>).

The label of the last coordinate, η<sub>*n*−1</sub>, is set equal to μ<sub>*n*−1</sub> and *x*<sub>η<sub>*n*−1</sub></sub> is solved first by flagging the first parity-check equation that contains *x*<sub>η<sub>*n*−1</sub></sub>, and then subtracting this equation from all other parity-check equations containing *x*<sub>η<sub>*n*−1</sub></sub>. Consequently, *x*<sub>η<sub>*n*−1</sub></sub> is now contained in only one equation, the first flagged equation.

The label of the next coordinate, η<sub>*n*−2</sub>, is set equal to μ<sub>*n*−2</sub> and an attempt is made to solve *x*<sub>η<sub>*n*−2</sub></sub> by finding an unflagged parity-check equation containing *x*<sub>η<sub>*n*−2</sub></sub>. In the event that there is no unflagged equation containing *x*<sub>η<sub>*n*−2</sub></sub>, η<sub>*n*−2</sub> is set equal to μ<sub>*n*−3</sub>, the label of the next most reliable bit, and the procedure is repeated until an unflagged equation contains *x*<sub>η<sub>*n*−2</sub></sub>. As before, this equation is flagged that it will be used to solve for *x*<sub>η<sub>*n*−2</sub></sub> and is subtracted from all other unflagged equations containing *x*<sub>η<sub>*n*−2</sub></sub>. The procedure continues until all of the *n* − *k* codeword coordinates *x*<sub>η<sub>*n*−1</sub></sub>, *x*<sub>η<sub>*n*−2</sub></sub>, *x*<sub>η<sub>*n*−3</sub></sub>, ..., *x*<sub>η<sub>*k*</sub></sub> have been solved and all *n* − *k* equations have been flagged. In effect, the least reliable coordinates are skipped if they cannot be solved. The remaining *k* ranked received coordinates are set equal to (*r*<sub>η<sub>0</sub></sub>, *r*<sub>η<sub>1</sub></sub>, *r*<sub>η<sub>2</sub></sub>, ..., *r*<sub>η<sub>*k*−1</sub></sub>) in most reliable order, where |*r*<sub>η<sub>0</sub></sub>| > |*r*<sub>η<sub>1</sub></sub>| > ··· > |*r*<sub>η<sub>*k*−1</sub></sub>|, and (*x*<sub>η<sub>0</sub></sub>, *x*<sub>η<sub>1</sub></sub>, ..., *x*<sub>η<sub>*k*−1</sub></sub>) are determined using the bit decision rule: *x*<sub>η<sub>*i*</sub></sub> = 1 if *r*<sub>η<sub>*i*</sub></sub> < 0, else *x*<sub>η<sub>*i*</sub></sub> = 0. The flagged parity-check equations are in upper triangular form and have to be solved in reverse order, starting with the last flagged equation. This equation gives the solution to *x*<sub>η<sub>*k*</sub></sub>, which is back-substituted into the other equations; *x*<sub>η<sub>*k*+1</sub></sub> is solved next, back-substituted, and so on, with coordinate *x*<sub>η<sub>*n*−1</sub></sub> solved last.
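A minimal sketch of this solution procedure. For compactness the pivot equation is subtracted from *every* other equation (full reduction) rather than only the unflagged ones, which removes the need for the back-substitution step; the resulting codeword **x**ˆ is identical. The (7, 4) Hamming parity-check matrix is an illustrative stand-in for the codes of this chapter:

```python
def dorsch_base_codeword(H, r):
    """Solve the n-k least reliable received positions as erasures from
    the parity checks, skipping any position that cannot be solved, and
    take hard decisions on the remaining k most reliable positions.
    Returns the base codeword x_hat as a list of bits."""
    n, nk = len(r), len(H)
    rows = [sum(bit << j for j, bit in enumerate(row)) for row in H]
    order = sorted(range(n), key=lambda j: abs(r[j]))  # least reliable first
    active, pivots = list(range(nk)), []
    for col in order:
        if len(pivots) == nk:
            break
        pick = next((i for i in active if (rows[i] >> col) & 1), None)
        if pick is None:
            continue          # skip: this erased position cannot be solved
        for i in range(nk):   # clear col from every other equation
            if i != pick and (rows[i] >> col) & 1:
                rows[i] ^= rows[pick]
        active.remove(pick)
        pivots.append((col, pick))
    x = [0 if r[j] >= 0 else 1 for j in range(n)]  # hard decisions
    for col, i in pivots:     # each equation now holds one solved bit
        x[col] = sum(x[j] for j in range(n)
                     if j != col and (rows[i] >> j) & 1) % 2
    return x

# (7, 4) Hamming parity-check matrix as a small illustration.
hamming_H = [[1, 0, 1, 0, 1, 0, 1],
             [0, 1, 1, 0, 0, 1, 1],
             [0, 0, 0, 1, 1, 1, 1]]
```

With a noiseless received vector the *k* retained hard decisions equal the transmitted bits, so the solved parity positions reproduce the transmitted codeword exactly.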

This codeword is denoted as **x**ˆ and the mapped version of the codeword is denoted as **c**ˆ.

As is well-known [13], the codeword most likely to be transmitted is the codeword, denoted as **x**˘, which has the smallest squared Euclidean distance, *D*(**x**˘), between the mapped codeword, **c**˘, and the received vector.

$$D(\check{\mathbf{x}}) = \sum\_{j=0}^{n-1} (r\_j - \check{c}\_j)^2$$

*D*(**x**˘) < *D*(**x**) for all other codewords **x**.

Equivalently **x**˘ is the codeword, after mapping, which has the highest cross correlation

$$Y(\check{\mathbf{x}}) = \sum\_{j=0}^{n-1} r\_j \times \check{c}\_j \tag{15.1}$$

*Y*(**x**˘) > *Y*(**x**) for all other codewords **x**.

The decoder may be simplified if the cross correlation function is used to compare candidate codewords. The cross correlation is firstly determined for the codeword **x**ˆ

$$Y(\hat{\mathbf{x}}) = \sum\_{j=0}^{n-1} r\_j \times \hat{c}\_j \tag{15.2}$$

It is interesting to make some observations about *Y*(**x**ˆ). Since the summation can be carried out in any order

$$Y(\hat{\mathbf{x}}) = \sum\_{j=0}^{n-1} r\_{\eta\_j} \times \hat{c}\_{\eta\_j} \tag{15.3}$$

and

$$Y(\hat{\mathbf{x}}) = \sum\_{j=0}^{k-1} r\_{\eta\_j} \times \hat{c}\_{\eta\_j} + \sum\_{j=k}^{n-1} r\_{\eta\_j} \times \hat{c}\_{\eta\_j} \tag{15.4}$$

Considering the first term

$$\sum\_{j=0}^{k-1} r\_{\eta\_j} \times \hat{c}\_{\eta\_j} = \sum\_{j=0}^{k-1} |r\_{\eta\_j}| \tag{15.5}$$

This is because the sign of *c*ˆ<sub>η<sub>*j*</sub></sub> equals the sign of *r*<sub>η<sub>*j*</sub></sub> for *j* < *k*. Thus, this term is independent of the code and Eq. (15.4) becomes

$$Y(\hat{\mathbf{x}}) = \sum\_{j=0}^{k-1} |r\_{\eta\_j}| + \sum\_{j=k}^{n-1} r\_{\eta\_j} \times \hat{c}\_{\eta\_j} \tag{15.6}$$

Almost all of the *k* largest received coordinates (all of the *k* largest terms for an MDS code) are contained in the first term of Eq. (15.6) and this ensures that the codeword **x**ˆ, after mapping, has a high correlation with **r**.

A binary (hard decision) received vector **b** may be derived from the received vector **r** using the bitwise decision rule: *b<sub>j</sub>* = 1 if *r<sub>j</sub>* < 0, else *b<sub>j</sub>* = 0, for *j* = 0 to *n* − 1. It should be noted that, in general, the binary vector **b** is not a codeword.

It is useful to define a binary vector **z**ˆ as

$$
\hat{\mathbf{z}} = \mathbf{b} \oplus \hat{\mathbf{x}}\tag{15.7}
$$

The maximum attainable correlation *Ymax* is given by

$$Y\_{\text{max}} = \sum\_{j=0}^{n-1} |r\_{\eta\_j}| \tag{15.8}$$

This correlation value occurs when there are no bit errors in transmission and provides an upper bound to the maximum achievable correlation for **x**˘. The correlation *Y*(**x**ˆ) may be expressed in terms of *Y<sub>max</sub>* and **z**ˆ, for

$$Y(\hat{\mathbf{x}}) = Y\_{\text{max}} - 2\sum\_{j=0}^{n-1} \hat{z}\_{\eta\_j} \times |r\_{\eta\_j}| \tag{15.9}$$

equivalently,

$$Y(\hat{\mathbf{x}}) = Y\_{\text{max}} - Y\_{\Delta}(\hat{\mathbf{x}}),\tag{15.10}$$

where *Y*Δ(**x**ˆ) is the shortfall from the maximum achievable correlation for the codeword **x**ˆ and is evidently

$$Y\_{\Delta}(\hat{\mathbf{x}}) = 2 \sum\_{j=0}^{n-1} \hat{z}\_{\eta\_j} \times |r\_{\eta\_j}| \tag{15.11}$$

Some observations may be made about the binary vector **z**ˆ. The coordinates *z*ˆ<sub>η<sub>*j*</sub></sub> for *j* = 0 to (*k* − 1) are always equal to zero. The maximum possible weight of **z**ˆ is thus *n* − *k* and the average weight is (*n* − *k*)/2 at low *E<sub>b</sub>*/*N<sub>o</sub>* values. At high *E<sub>b</sub>*/*N<sub>o</sub>* values, the average weight of **z**ˆ is small because there is a high chance that **x**ˆ is equal to the transmitted codeword. It may be seen from Eq. (15.11) that, in general, the lower the weight of **z**ˆ, the smaller will be *Y*<sub>Δ</sub>(**x**ˆ) and the larger will be the correlation value *Y*(**x**ˆ).
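Equations (15.8)–(15.11) are easy to verify numerically: computing *Y*(**x**) directly from Eq. (15.1), and computing it as *Y<sub>max</sub>* minus twice the weighted sum over the positions where **x** disagrees with the hard-decision vector **b**, must give the same value for any binary vector. A sketch with randomly generated data:

```python
import random

def correlation(r, x):
    """Direct cross correlation Y(x) of Eq. (15.1); bits are mapped
    0 -> +1 and 1 -> -1 as in the text."""
    return sum(rj * (1 - 2 * xj) for rj, xj in zip(r, x))

def correlation_via_shortfall(r, x):
    """Y(x) = Ymax - 2 * sum z_j * |r_j| (Eqs. (15.8)-(15.11)), where
    z = b XOR x and b is the hard-decision vector derived from r."""
    b = [1 if rj < 0 else 0 for rj in r]
    y_max = sum(abs(rj) for rj in r)
    return y_max - 2 * sum((bj ^ xj) * abs(rj)
                           for bj, xj, rj in zip(b, x, r))

rng = random.Random(42)
r = [rng.gauss(0, 1) for _ in range(16)]    # received vector
x = [rng.randrange(2) for _ in range(16)]   # an arbitrary binary vector
```

This identity is what lets the decoder rank candidate codewords by the shortfall *Y*<sub>Δ</sub> alone, summed over only the nonzero coordinates of **z**.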

Since there is no guarantee that the codeword **x**ˆ is the transmitted codeword, the decoder has to evaluate additional codewords since one or more of these may produce a correlation higher than **x**ˆ. There are 2*<sup>k</sup>* − 1 other codewords which may be derived by considering all other 2*<sup>k</sup>* − 1 sign combinations of *c*η*<sup>j</sup>* for *j* = 0 to *k* − 1. For any of these codewords denoted as **ci** the first term of the correlation given in Eq. (15.6) is bound to be smaller since

$$\sum\_{j=0}^{k-1} r\_{\eta\_j} \times c\_{i,\eta\_j} < \sum\_{j=0}^{k-1} |r\_{\eta\_j}| \tag{15.12}$$

This is because there has to be, by definition, at least one sign change of *c*<sub>*i*,η<sub>*j*</sub></sub> compared to *c*ˆ<sub>η<sub>*j*</sub></sub> for *j* = 0 to *k* − 1. In order for *Y*(**x<sub>i</sub>**) to be larger than *Y*(**x**ˆ), the second term of the correlation, $\sum_{j=k}^{n-1} r_{\eta_j} \times c_{i,\eta_j}$, which uses the bits from the solved parity-check equations, must be larger than $\sum_{j=k}^{n-1} r_{\eta_j} \times \hat{c}_{\eta_j}$ plus the negative contribution from the first term.

However, the first term involves higher received magnitudes than the second term because the received coordinates are ordered. It follows that codewords likely to have a higher correlation than $\hat{\mathbf{x}}$ will have a small number of differences in the coordinates $x\_{\eta\_j}$ for $j = 0$ to $k-1$. As the code is linear, these differences will themselves correspond to a codeword, and codewords may be generated that have low weight in coordinates $x\_{\eta\_j}$ for $j = 0$ to $k-1$. These codewords are represented as $\tilde{\mathbf{x}}\_{\mathbf{i}}$ and referred to as low information weight codewords, since the coordinates $x\_{\eta\_j}$ for $j = 0$ to $k-1$ form an information set. Thus, codewords $\mathbf{x\_i}$ are given by

$$\mathbf{x\_i} = \hat{\mathbf{x}} \oplus \tilde{\mathbf{x\_i}} \tag{15.13}$$

and $\tilde{\mathbf{x}}\_{\mathbf{i}}$ are codewords chosen to have increasing weight in coordinates $x\_{\eta\_j}$ for $j = 0$ to $k-1$ as $i$ is incremented. This means that for increasing $i$ it becomes less likely that a codeword will be found that has higher correlation than the correlation of a codeword already found.

The difference in the correlation value $Y\_{\Delta}(\mathbf{x\_i})$ as a function of $\tilde{\mathbf{x}}\_{\mathbf{i}}$ may be derived. Firstly, the binary vector $\mathbf{z\_i}$ is given by

$$\mathbf{z\_i} = \mathbf{b} \oplus \hat{\mathbf{x}} \oplus \tilde{\mathbf{x\_i}} \tag{15.14}$$

which may be simplified to

$$\mathbf{z\_i} = \hat{\mathbf{z}} \oplus \tilde{\mathbf{x\_i}} \tag{15.15}$$

The cross-correlation $Y(\mathbf{x\_i})$ is given by

$$Y(\mathbf{x\_i}) = Y\_{\text{max}} - 2\sum\_{j=0}^{n-1} z\_{i,\eta\_j} \times |r\_{\eta\_j}| \tag{15.16}$$

equivalently

$$Y(\mathbf{x\_i}) = Y\_{\text{max}} - Y\_{\Delta}(\mathbf{x\_i}) \tag{15.17}$$

The shortfall from maximum correlation, $Y\_{\Delta}(\mathbf{x\_i})$, is evidently

$$Y\_{\Delta}(\mathbf{x\_{i}}) = 2\sum\_{j=0}^{n-1} z\_{i,\eta\_{j}} \times |r\_{\eta\_{j}}| \tag{15.18}$$

Substituting for $\mathbf{z\_i}$ gives $Y\_{\Delta}(\mathbf{x\_i})$ as a function of $\tilde{\mathbf{x}}\_{\mathbf{i}}$.

$$Y\_{\Delta}(\mathbf{x\_i}) = 2 \sum\_{j=0}^{n-1} (\hat{z}\_{\eta\_j} \oplus \tilde{x}\_{i,\eta\_j}) \times |r\_{\eta\_j}| \tag{15.19}$$

It is apparent that, instead of the decoder determining $Y(\mathbf{x\_i})$ for each codeword $\mathbf{x\_i}$, it is sufficient for the decoder to determine $Y\_{\Delta}(\mathbf{x\_i})$ for each codeword $\tilde{\mathbf{x}}\_{\mathbf{i}}$ and compare the value with the smallest value obtained so far, denoted as $Y\_{\Delta}(\mathbf{x\_{min}})$, starting with $Y\_{\Delta}(\hat{\mathbf{x}})$:

$$Y\_{\Delta}(\mathbf{x\_{min}}) = \min\_{i} \left( Y\_{\Delta}(\mathbf{x\_i}) \right) \tag{15.20}$$

Thus it is more efficient for the decoder to compute the correlation (partial sum) of the $\tilde{\mathbf{x}}\_{\mathbf{i}}$ instead of deriving $(\hat{\mathbf{x}} \oplus \tilde{\mathbf{x}}\_{\mathbf{i}})$ by solving $\mathbf{H}$ and computing the squared Euclidean distance. Since codewords $\tilde{\mathbf{x}}\_{\mathbf{i}}$ produce low weight in $\mathbf{z\_i}$, the number of non-zero terms that need to be evaluated in Eq. (15.18) is typically $\frac{n-k}{2}$ rather than the $\frac{n}{2}$ terms of Eq. (15.1), which makes for an efficient, fast decoder. Before Eq. (15.19) is evaluated, the Hamming weight of $\mathbf{z\_i}$ may be compared to a threshold and the correlation stage bypassed if the Hamming weight of $\mathbf{z\_i}$ is high. There is an associated performance loss and results are presented in Sect. 15.4.
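As a concrete illustration, the partial-sum metric of Eq. (15.18) can be sketched as below. This is a minimal sketch, not the book's implementation; the function name, and the toy vectors `z_hat` and `r`, are purely illustrative.

```python
# Sketch of the correlation-shortfall metric of Eq. (15.18), assuming the
# received magnitudes have already been re-ordered into the solved order.
def correlation_shortfall(z, r_ordered):
    """Y_delta = 2 * sum of |r| over the non-zero coordinates of z."""
    return 2.0 * sum(abs(r) for z_bit, r in zip(z, r_ordered) if z_bit)

# A candidate with fewer disagreements with the hard decisions has a
# smaller shortfall, hence a larger correlation Y = Y_max - Y_delta.
z_hat = [0, 0, 0, 1, 0, 1]          # disagreements in parity positions only
r = [1.2, 1.1, 0.9, 0.4, 0.3, 0.1]  # magnitudes, most reliable first
print(correlation_shortfall(z_hat, r))  # 2 * (0.4 + 0.1) = 1.0
```

Because the non-zero entries of $\mathbf{z\_i}$ are typically few and concentrated in the least reliable positions, the sum involves far fewer terms than a full correlation.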

The maximum information weight $w\_{inf\,max}$ necessary to achieve maximum likelihood decoding may be upper bounded from $Y\_{\Delta}(\hat{\mathbf{x}})$ and $|r\_{\eta\_j}|$ initially, updated by $Y\_{\Delta}(\mathbf{x\_{min}})$ as decoding progresses, since

$$Y\_{\Delta}(\mathbf{x\_i}) \ge \sum\_{m=0}^{w\_{inf}-1} |r\_{\eta\_{k-m-1}}| \tag{15.21}$$

This bound is reasonably tight since there is a possibility of at least one codeword with information weight $w\_{inf\,max}$ for which all of the coordinates of the binary vector $\mathbf{z\_i}$ corresponding to the parity bits of $\tilde{\mathbf{x}}\_{\mathbf{i}}$ are zero. Correspondingly, $w\_{inf\,max}$ is the smallest integer such that

$$\sum\_{m=0}^{w\_{inf\,max}-1} |r\_{\eta\_{k-m-1}}| \ge Y\_{\Delta}(\hat{\mathbf{x}}) \tag{15.22}$$
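The search for this smallest integer can be sketched as follows: accumulate the least reliable information-coordinate magnitudes until the shortfall of the first candidate is reached. The function name and the toy magnitudes are illustrative assumptions, not from the text.

```python
# Sketch of Eq. (15.22): w_inf_max is the smallest integer such that the sum
# of the w_inf_max least reliable information magnitudes reaches Y_delta(x_hat).
def max_information_weight(y_delta_hat, r_ordered, k):
    total = 0.0
    # Information coordinates occupy positions 0..k-1 in the solved order,
    # so eta_{k-1} is the least reliable information coordinate.
    for m in range(k):
        total += abs(r_ordered[k - m - 1])
        if total >= y_delta_hat:
            return m + 1
    return k

r = [2.0, 1.5, 1.0, 0.5]   # k = 4 information magnitudes, ordered
print(max_information_weight(2.2, r, 4))  # 0.5 + 1.0 < 2.2, + 1.5 >= 2.2 -> 3
```

As decoding progresses, replacing `y_delta_hat` by the current $Y\_{\Delta}(\mathbf{x\_{min}})$ can only shrink the returned weight.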

The codewords $\tilde{\mathbf{x}}\_{\mathbf{i}}$ may be most efficiently derived from the $\mathbf{G}$ matrix corresponding to the solved $\mathbf{H}$ matrix, because the maximum information weight given by Eq. (15.22) turns out to be small. Each row $i$ of the solved $\mathbf{G}$ matrix is derived by setting $x\_{\eta\_j} = 0$ for $j = 0$ to $k-1$, $j \ne i$, and using the solved parity-check equations to determine $x\_{\eta\_j}$ for $j = k$ to $n-1$. The maximum number of rows of the $\mathbf{G}$ matrix that need to be combined to produce $\tilde{\mathbf{x}}\_{\mathbf{i}}$ is $w\_{inf\,max}$.
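The enumeration of candidates as XOR combinations of up to $w\_{inf\,max}$ rows of the solved generator matrix can be sketched as below. The tiny (4, 2) generator matrix is an illustrative stand-in, not one of the codes discussed in this chapter.

```python
from itertools import combinations

# Sketch: enumerate the low information weight codewords as XOR
# combinations of up to w_inf_max rows of the solved G matrix.
def low_weight_candidates(G, w_inf_max):
    k, n = len(G), len(G[0])
    for w in range(1, w_inf_max + 1):
        for rows in combinations(range(k), w):
            cw = [0] * n
            for r_idx in rows:
                cw = [a ^ b for a, b in zip(cw, G[r_idx])]
            yield cw

# Toy systematic G for a (4, 2) code (illustrative only).
G = [[1, 0, 1, 1],
     [0, 1, 0, 1]]
cands = list(low_weight_candidates(G, 2))
print(len(cands))  # 3: two weight-1 patterns plus one weight-2 pattern
```

Ordering the enumeration by increasing information weight matches the decoder's schedule: later candidates become progressively less likely to beat the best correlation found so far.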

# **15.3 Number of Codewords that Need to Be Evaluated to Achieve Maximum Likelihood Decoding**

For each received vector the decoder needs to evaluate the correlation shortfall for the codewords $\tilde{\mathbf{x}}\_{\mathbf{i}}$ for information weights up to the maximum information weight $w\_{inf\,max}$ in order to achieve maximum likelihood decoding. The number of codewords that need to be evaluated is a function of the received vector. Not all of the codewords having information weight less than or equal to $w\_{inf\,max}$ need be evaluated, because lower bounds may be derived for $Y\_{\Delta}(\mathbf{x\_i})$ in terms of the coordinates of the information bits, their total weight and the magnitudes of selected coordinates of the received vector. For an information weight of $w\_{inf}$, $Y\_{\Delta}(\mathbf{x\_i})$ is lower bounded by

$$Y\_{\Delta}(\mathbf{x\_{i}}) \ge |r\_{\eta\_{j}}| + \sum\_{m=0}^{w\_{inf}-2} |r\_{\eta\_{k-m-1}}| \quad 0 \le j < k - w\_{inf} + 1 \tag{15.23}$$

and

$$|r\_{\eta\_{j\_{min}(w\_{inf})}}| \ge Y\_{\Delta}(\mathbf{x\_{min}}) - \sum\_{m=0}^{w\_{inf}-2} |r\_{\eta\_{k-m-1}}| \tag{15.24}$$

where $j\_{min}(w\_{inf})$ is defined as the lower limit for $j$ to satisfy Eq. (15.24). The minimum number of codewords that need to be evaluated as a function of the received vector, $N(\mathbf{r})$, is given by the total number of combinations

$$N(\mathbf{r}) = \sum\_{m=0}^{w\_{inf\,max}} \binom{k - j\_{\text{min}}(m) - 1}{m} \tag{15.25}$$

For many short codes the minimum number of codewords that need to be evaluated is surprisingly small in comparison to the total number of codewords.
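The count of Eq. (15.25) can be sketched directly with binomial coefficients. The $j\_{min}(m)$ profile below is an assumed, hypothetical example chosen purely for illustration; in practice it is derived from the received magnitudes via Eq. (15.24).

```python
from math import comb

# Sketch of Eq. (15.25): count the candidate codewords that must be
# evaluated, given a j_min(m) profile (assumed values for illustration).
def num_candidates(k, j_min, w_inf_max):
    return sum(comb(k - j_min(m) - 1, m) for m in range(w_inf_max + 1))

j_min = lambda m: max(0, 8 - 2 * m)   # hypothetical profile
print(num_candidates(20, j_min, 3))   # 1 + 13 + 105 + 680 = 799
```

Even for this toy profile the count is far below the $2^k$ codewords of an exhaustive search, which is the point made above for short codes.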

#### **15.4 Results for Some Powerful Binary Codes**

The decoder can be used with any linear code, and best results are obtained for codes which have the highest known $d\_{min}$ for a given codelength $n$ and number of information symbols $k$. The best binary codes are tabulated up to length 257 in Markus Grassl's online database [7]. Non-binary codes are also tabulated: for example, ternary codes of length up to 243 symbols and GF(4) codes of length up to 256 symbols.

A particularly good class of codes is the binary self-dual, double-circulant codes first highlighted in a classic paper by Karlin [8]. For example, the (24, 12, 8) extended Golay code is included, since it may be put in double-circulant form. There is also the (48, 24, 12) bordered double-circulant code, based on quadratic residues of the prime 47, and the (136, 68, 24) bordered double-circulant code based on quadratic residues of the prime 67. These codes are extremal [3] and are doubly even, only having codeword weights that are a multiple of 4; in these cases it is necessary that the codelength is a multiple of 8 [3]. For higher code rates and lengths greater than 256, the best codes are tabulated in [12], and some of these include cyclic codes and Goppa codes.

#### *15.4.1 The (136, 68, 24) Double-Circulant Code*

This code is a bordered double-circulant code based on the identity matrix and a matrix whose rows consist of all cyclic shifts, modulo $1 + x^{67}$, of the polynomial $b(x)$ defined by

$$\begin{aligned} b(x) = {} & 1 + x + x^4 + x^6 + x^9 + x^{10} + x^{14} + x^{15} + x^{16} + x^{17} + x^{19} + x^{21} + x^{22} + x^{23} \\ & + x^{24} + x^{25} + x^{26} + x^{29} + x^{33} + x^{35} + x^{36} + x^{37} + x^{39} + x^{40} + x^{47} + x^{49} \\ & + x^{54} + x^{55} + x^{56} + x^{59} + x^{60} + x^{62} + x^{64} + x^{65} \end{aligned} \tag{15.26}$$
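The text states that this code is based on the quadratic residues of the prime 67. Under that assumption, the coefficient vector of $b(x)$ can be generated programmatically; this is a sketch, and the variable names are illustrative.

```python
# Sketch: generate the coefficients of b(x) from the quadratic residues
# modulo 67, assuming the exponents of b(x) are exactly the residues plus
# the constant term (as the quadratic-residue construction suggests).
def quadratic_residues(p):
    return sorted({(i * i) % p for i in range(1, p)})

p = 67
coeffs = [0] * p
coeffs[0] = 1                    # constant term of b(x)
for e in quadratic_residues(p):
    coeffs[e] = 1
# A circulant row is a cyclic shift modulo x^67 + 1:
row1 = coeffs[-1:] + coeffs[:-1]
print(sum(coeffs))  # 34 non-zero coefficients
```

Each of the 67 circulant rows is a further cyclic shift of `coeffs`; together with the identity matrix and the border they form the (136, 68, 24) generator matrix.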

The Frame Error Rate (FER) of this code using the extended Dorsch decoder, with the maximum number of evaluated codewords limited to $3 \times 10^6$, is shown in Fig. 15.1. Also shown in Fig. 15.1 is Shannon's sphere packing bound [14] offset by the loss for binary transmission [1], which is 0.19 dB for a code rate of $\frac{1}{2}$.

**Fig. 15.1** FER as a function of $\frac{E\_b}{N\_o}$ for the double-circulant (136, 68, 24) code using incremental correlation decoding compared to the sphere packing bound, offset for binary transmission

It may be seen from Fig. 15.1 that the performance of the decoder in conjunction with the double-circulant code is within 0.2 dB of the best achievable performance for any (136, 68) code at $10^{-5}$ FER. Interestingly, there is a significant number of maximum likelihood codeword errors which have a Hamming distance of 36 or 40 from the transmitted codeword. This indicates that a bounded distance decoder would not perform very well for this code. At the typical practical operating point of $\frac{E\_b}{N\_o}$ equal to 3.5 dB, the probability of the decoder processing each received vector as a maximum likelihood decoder is plotted in Fig. 15.2 as a function of the number of codewords evaluated.

Of course, to guarantee maximum likelihood decoding, all $2^{68} \approx 2.95 \times 10^{20}$ codewords need to be evaluated by the decoder. Equation (15.21) has been evaluated for the double-circulant (136, 68, 24) code in computer simulations, at an $\frac{E\_b}{N\_o}$ of 3.5 dB, for each received vector and the cumulative distribution derived. Figure 15.2 shows that, by evaluating $10^7$ codewords per received vector, 65% of received vectors are guaranteed to be maximum likelihood decoded. For the remaining 35% of received vectors, although maximum likelihood decoding is not guaranteed, the probability is very small that the codeword with the highest correlation is not the transmitted codeword or a codeword closer to the received vector than the transmitted codeword. This last point is illustrated by Fig. 15.3, which shows the FER performance of the decoder as a function of the maximum number of evaluated codewords.

**Fig. 15.2** Probability of a received vector being maximum likelihood decoded as a function of the number of evaluated codewords for the (136, 68, 24) code at $\frac{E\_b}{N\_o} = 3.5$ dB

**Fig. 15.3** FER performance of the (136, 68, 24) code as a function of number of evaluated codewords

**Fig. 15.4** An example of received coordinate magnitudes in their solved order for the (136, 68, 24) code at $\frac{E\_b}{N\_o} = 2.5$ dB for a single received vector

The detailed operation of the decoder may be seen by considering an example of a received vector at an $\frac{E\_b}{N\_o}$ of 2.5 dB. The magnitudes of the received coordinates, ordered in their solved order, are shown in Fig. 15.4. In this particular example, it is not possible to solve for ordered coordinates 67 and 68 (in their order prior to solving of the parity-check matrix) and so these coordinates are skipped and become coordinates 68 and 69, respectively, in the solved order. The transmitted bits are normalised with magnitudes 1 and the $\sigma$ of the noise is $\approx 1.07$. The shift in position of coordinate 69 (in original position) to 67 (in solved order) is evident in Fig. 15.4. The positions of the bits received in error, in the same solved order, are shown in Fig. 15.5. It may be noted that the received bit errors are concentrated in the least reliable bit positions. There are a total of 16 received bit errors and only two of these errors correspond to the (data) bit coordinates 11 and 34 of the solved $\mathbf{G}$ matrix. Evaluation of $10^7$ codewords indicates that the minimum value of $Y\_{\Delta}(\mathbf{x\_{min}})$ is $\approx 13.8$; this occurs for the 640th codeword, producing a maximum correlation of $\approx 126.2$ with $Y\_{max} \approx 140$. The weight of $\mathbf{z\_{min}}$ is 16, corresponding to the 16 received bit errors.

In practice, it is not necessary for $Y\_{\Delta}(\mathbf{x\_i})$, given by the partial sum equation (15.18), to be evaluated for each codeword. In most cases, the weight of the binary vector $\mathbf{z\_i}$ is sufficiently high to indicate that this codeword is not the most likely codeword. Shown in Fig. 15.6 are the cumulative probability distributions for the weight of $\mathbf{z\_i}$ for the case where $\mathbf{x\_i}$ is equal to the transmitted codeword, and the case where it is not. Two operating values of $\frac{E\_b}{N\_o}$ are shown: 3.5 dB and 4 dB. Applying the decoding rule that a weight of 29 or more for $\mathbf{z\_i}$ is unlikely to be produced by the transmitted codeword means that 95.4% of candidate codewords

**Fig. 15.5** Received bits showing bit error positions for the same received vector and same order as that shown in Fig. 15.4

**Fig. 15.6** Cumulative probability distributions for the number of bit errors for the transmitted codeword and non-transmitted, evaluated codewords for the (136, 68, 24) code

may be rejected at this point, and that the partial sum equation (15.18) need only be evaluated for 4.6% of the candidate codewords. In reducing the decoder complexity in this way, the degradation to the FER performance as a result of rejection of a transmitted codeword corresponds to an increase of $\approx 3\%$ in the FER and is not significant.
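The two-stage screen-then-score rule described above can be sketched as follows. This is an illustrative sketch: the function name is invented, and the threshold of 29 is the value quoted above for the (136, 68, 24) code.

```python
# Sketch of the complexity-reduction rule: reject a candidate when the
# Hamming weight of z_i reaches the threshold, and only otherwise
# evaluate the partial-sum metric of Eq. (15.18).
def screen_then_score(z, r_ordered, threshold=29):
    if sum(z) >= threshold:
        return None                      # rejected without the partial sum
    return 2.0 * sum(abs(r) for zb, r in zip(z, r_ordered) if zb)

z_light = [1, 0, 1] + [0] * 133          # weight 2: passes the screen
z_heavy = [1] * 30 + [0] * 106           # weight 30: rejected
r = [1.0] * 136
print(screen_then_score(z_light, r))     # 4.0
print(screen_then_score(z_heavy, r))     # None
```

Only the small fraction of candidates passing the weight screen incurs the cost of the full partial sum, which is the source of the quoted complexity saving.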

#### *15.4.2 The (255, 175, 17) Euclidean Geometry (EG) Code*

This code is an EG code originally used in hard decision, one-step majority-logic decoding by Lin and Costello, Jr. [10]. Finite geometry codes also have applications as LDPC codes using iterative decoding with the belief propagation algorithm [9]. The (255, 175, 17) code is a cyclic code and its parity-check polynomial *p*(*x*) may conveniently be generated from the cyclotomic idempotents as described in Chap. 12. The parity-check polynomial is

$$\begin{aligned} p(x) = {} & 1 + x + x^3 + x^7 + x^{15} + x^{26} + x^{31} + x^{53} + x^{63} + x^{98} \\ & + x^{107} + x^{127} + x^{140} + x^{176} + x^{197} + x^{215} \end{aligned} \tag{15.27}$$

The FER performance of the code, obtained using the incremental correlation decoder, is shown in Fig. 15.7 in comparison with iterative decoding. Also shown in Fig. 15.7 is the sphere packing bound offset by the binary transmission loss.

**Fig. 15.7** FER performance of the (255, 175, 17) EG code using belief propagation, iterative decoding, compared to incremental correlation decoding

Although this EG code performs well with iterative decoding, it is apparent that the incremental correlation decoder is able to improve the performance of the code for the AWGN channel by 0.45 dB at $10^{-3}$ FER.

#### *15.4.3 The (513, 467, 12) Extended Binary Goppa Code*

Goppa codes are frequently better than the corresponding BCH codes because they provide an additional information bit at the cost of being only one bit longer. For example, the (512, 467, 11) binary Goppa code has one more information bit than the (511, 466, 11) BCH code and may be generated by the irreducible Goppa polynomial $1 + x^2 + x^5$, whose roots have order 31, which is relatively prime to 511. The $d\_{min}$ of the binary Goppa code [12] is equal to twice the degree of the irreducible polynomial plus 1, and is the same as that of the (511, 466, 11) BCH code. The Goppa code may be extended by adding an overall parity check, increasing the $d\_{min}$ to 12.

The FER performance of the extended Goppa code, obtained using the incremental correlation decoder, is shown in Fig. 15.8. Also shown in Fig. 15.8 is the sphere packing bound offset by the binary transmission loss. It can be seen that the realised performance of the decoder is within 0.3 dB of the bound at a FER of $10^{-4}$.

**Fig. 15.8** FER performance of the (513, 467, 12) binary Goppa code using incremental correlation decoding

**Fig. 15.9** FER performance of the (1023, 983, 9) binary BCH code using incremental correlation decoding compared to hard decision decoding

#### *15.4.4 The (1023, 983, 9) BCH Code*

This code is a standard BCH code that may be found in reference textbook tables such as those of Lin and Costello, Jr. [10]. This example is considered here in order to show that the decoder can produce near maximum likelihood performance for relatively long codes. The performance obtained is shown in Fig. 15.9, with the evaluation of candidate codewords limited to $10^6$ codewords. At $10^{-5}$ FER, the degradation from the sphere packing bound, offset for binary transmission, is 1.8 dB. Although this may seem excessive, the degradation of hard decision decoding is 3.6 dB, as may also be seen from Fig. 15.9.

#### **15.5 Extension to Non-binary Codes**

The extension of the decoder to non-binary codes is relatively straightforward, and for simplicity binary transmission of the components of each non-binary symbol is assumed. Codewords are denoted as before by $\mathbf{x\_i}$, but redefined with coefficients $\gamma\_{j\,i}$ from $GF(2^m)$

$$\mathbf{x\_i} = (\boldsymbol{\gamma}\_{0\,i} \boldsymbol{x\_0}, \boldsymbol{\gamma}\_{1\,i} \boldsymbol{x\_1}, \boldsymbol{\gamma}\_{2\,i} \boldsymbol{x\_2}, \dots, \boldsymbol{\gamma}\_{n-1\,i} \boldsymbol{x\_{n-1}}) \tag{15.29}$$

The received vector **r** with coordinates ranked in order of those most likely to be correct is redefined as

$$\mathbf{r} = \sum\_{l=0}^{m-1} (r\_{l\mu\_0}, r\_{l\mu\_1}, r\_{l\mu\_2}, \dots, r\_{l\mu\_{n-1}}) \tag{15.30}$$

so that the received vector consists of $n$ symbols, each with $m$ values. The maximum attainable correlation $Y\_{max}$ is straightforward and is given by

$$Y\_{\max} = \sum\_{j=0}^{n-1} \sum\_{l=0}^{m-1} |r\_{lj}| \tag{15.31}$$

The hard-decided received vector $\mathbf{b}$ is redefined as

$$\mathbf{b} = \sum\_{j=0}^{n-1} \theta\_j \mathbf{x}^j \tag{15.32}$$

where $\theta\_j$ is the $GF(2^m)$ symbol corresponding to $sign(r\_{l\,j})$ for $l = 0$ to $m-1$.

Decoding follows in a similar manner to the binary case. The received symbols are ordered in terms of their symbol magnitudes $|r\_{\eta\_j}|\_S$, where each symbol magnitude is defined as

$$|r\_{\eta\_j}|\_S = \sum\_{l=0}^{m-1} |r\_{l\,\eta\_j}| \tag{15.33}$$

The codeword $\hat{\mathbf{x}}$ is derived from the $k$ coordinates $x\_{\eta\_j}$ whose coefficients $\nu\_{\eta\_j}$ are the $GF(2^m)$ symbols corresponding to $sign(r\_{l\,\eta\_j})$ for $l = 0$ to $m-1$ and $j = 0$ to $k-1$, and then using the solved parity-check equations for the remaining $n-k$ coordinates.
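The symbol-level ordering of Eq. (15.33) can be sketched as below; the array layout and the toy GF(4) values are illustrative assumptions.

```python
# Sketch of Eq. (15.33): rank non-binary symbols by the summed magnitude
# of their m soft bit values. r is laid out as n rows of m values.
def symbol_order(r):
    mags = [sum(abs(v) for v in sym) for sym in r]
    return sorted(range(len(r)), key=lambda j: -mags[j])

r = [[0.1, -0.2], [1.0, 0.9], [-0.5, 0.4]]   # n = 3 GF(4) symbols, m = 2
print(symbol_order(r))  # most reliable symbol first: [1, 2, 0]
```

The returned permutation plays the role of $\eta$ in the binary case: the $k$ most reliable symbols supply the information set, and the parity-check equations are solved for the rest.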

The vector **zi** is given by

$$\mathbf{z\_i} = \mathbf{b} \oplus \hat{\mathbf{x}} \oplus \tilde{\mathbf{x}}\_{\mathbf{i}} \mod GF(2^m) \tag{15.34}$$

which may be simplified as before to

$$\mathbf{z\_i} = \hat{\mathbf{z}} \oplus \tilde{\mathbf{x}}\_{\mathbf{i}} \mod GF(2^m) \tag{15.35}$$

Denoting by $\rho\_{i\,l\,j}$ the $n$ binary vectors corresponding to the $n$ $GF(2^m)$ coefficients of $\mathbf{z\_i}$, the correlation is

$$Y(\mathbf{x\_i}) = Y\_{\text{max}} - Y\_{\Delta}(\mathbf{x\_i}) \tag{15.36}$$

where $Y\_{\Delta}(\mathbf{x\_i})$, the shortfall from maximum correlation, is given by

$$Y\_{\Delta}(\mathbf{x\_i}) = 2 \sum\_{j=0}^{n-1} \sum\_{l=0}^{m-1} \rho\_{i\,l\,j} \times |r\_{l\,j}| \tag{15.37}$$

In the implementation of the decoder, as in the binary case, the Hamming weight of the vector **zi** may be used to decide whether it is necessary to evaluate the soft decision metric given by Eq. (15.37) for each candidate codeword.
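The non-binary shortfall of Eq. (15.37) is the binary metric applied bit-by-bit across the $m$ components of each symbol; a minimal sketch follows, with illustrative names and toy GF(4) data.

```python
# Sketch of Eq. (15.37): the shortfall for non-binary codes sums the bit
# magnitudes wherever the binary expansion rho of z_i is non-zero.
def shortfall_nonbinary(rho, r):
    # rho and r are both laid out as n rows of m entries
    return 2.0 * sum(abs(r[j][l])
                     for j in range(len(rho))
                     for l in range(len(rho[0])) if rho[j][l])

rho = [[0, 1], [0, 0], [1, 1]]               # n = 3, m = 2 bit expansion
r = [[0.3, 0.7], [1.0, 0.9], [0.2, 0.4]]
print(round(shortfall_nonbinary(rho, r), 6))  # 2 * (0.7 + 0.2 + 0.4) = 2.6
```

As in the binary case, the Hamming weight of `rho` can first be screened against a threshold so that this sum is only evaluated for promising candidates.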

## *15.5.1 Results for the (63, 36, 13) GF(***4***) BCH Code*

This is a non-binary BCH code with the generator polynomial *g*(*x*) defined by roots

$$\{\alpha^{1}, \alpha^{4}, \alpha^{16}, \alpha^{2}, \alpha^{8}, \alpha^{32}, \alpha^{3}, \alpha^{12}, \alpha^{48}, \alpha^{5}, \alpha^{20}, \alpha^{17}, \alpha^{6}, \alpha^{24}, \alpha^{33}, \alpha^{7}, \alpha^{28}, \alpha^{49}, \alpha^{9}, \alpha^{36}, \alpha^{18}, \alpha^{10}, \alpha^{40}, \alpha^{34}, \alpha^{11}, \alpha^{44}, \alpha^{50}\}$$

**Fig. 15.10** FER performance of the (63, 36, 13) GF(4) BCH code using incremental correlation decoding compared to hard decision decoding

The benefit of having $GF(4)$ coefficients is that $g(x)$ does not need to contain the roots

$$\{\alpha^{14}, \alpha^{56}, \alpha^{35}, \alpha^{22}, \alpha^{25}, \alpha^{37}\}$$

which are necessary to constrain $g(x)$ to binary coefficients [12]. Correspondingly, the binary version of this BCH code is the lower rate (63, 30, 13) code with six fewer information symbols (bits).

The performance of the (63, 36, 13) $GF(4)$ BCH code is shown in Fig. 15.10 for the AWGN channel using Quadrature Amplitude Modulation (QAM). Also shown in Fig. 15.10 is the performance of the code with hard decision decoding. It may be seen that at $10^{-4}$ FER the performance of the incremental correlation decoder is 2.9 dB better than that of the hard decision decoder.

## **15.6 Conclusions**

It has been shown that the extended Dorsch decoder may approach maximum likelihood decoding by an incremental correlation approach in which for each received vector a partial summation metric is evaluated as a function of low information weight codewords. Furthermore, the number of information weight codewords that need to be evaluated to achieve maximum likelihood decoding may be calculated as an upper bound for each received vector. Consequently, for each received vector it is known whether the decoder has achieved maximum likelihood decoding. An efficient decoder structure consisting of a combination of hard decision threshold decoding followed by partial sum correlation was also described, which enables practical decoders to trade-off performance against complexity.

The decoder for non-binary codes was shown to be straightforward for the AWGN channel, and an example was described for a (63, 36, 13) GF(4) BCH code using QAM to transmit each GF(4) symbol. It is readily possible to extend the decoder to other modulation formats by extensions to the incremental correlation of Eq. (15.37), although this inevitably involves an increase in complexity. It is hoped that there will be sufficient interest from the coding community to address this research area.

Another interesting conclusion is just how well some codes in Brouwer's table perform with maximum likelihood decoding. In particular, the (136, 68, 24) double-circulant, extremal, self-dual code is shown to be an outstanding code.

It seems that the implementation of this type of decoder coupled with the availability of powerful processors will eventually herald a new era in the application of error control coding with the re-establishment of the importance of the optimality of codes rather than the ease of decoding. Certainly, this type of decoder is more complex than an iterative decoder, but the demonstrable performance, which is achievable for short codes, can approach theoretical limits for error-correction coding performance such as the sphere packing bound.

## **15.7 Summary**

The currently unobtainable goal of a practical realisation of a maximum likelihood decoder that can be used with any error-correcting code has been partially addressed with the modified Dorsch decoder presented in this chapter. A decoder based on enhancements to the original Dorsch decoder has been described which achieves near maximum likelihood performance for all codes whose codelength is not too long. It is a practical decoder for half-rate codes having a codelength less than about 180 bits using current digital processors. The performance achieved by the decoder when using different examples of outstanding binary codes has been evaluated and the results presented in this chapter. A description of the decoder suitable for use with non-binary codes has also been given. An example showing the results obtained by the decoder using a (63, 36, 13) GF(4) non-binary code for the AWGN channel has also been presented.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 16 A Concatenated Error-Correction System Using the |***u***|***u* **+** *v***| Code Construction**

#### **16.1 Introduction**

There is a classical method of error-correcting code construction in which two good codes are combined to form a new, longer code, first pioneered by Plotkin [1]. The Plotkin sum, also known as the |*u*|*u* + *v*| construction method [3], consists of one or more codes having replicated codewords to which are added codewords from one or more other codes to form a concatenated code. This code construction may be exploited in the receiver with a decoder that first decodes, from a received vector, the one or more individual codewords added in the Plotkin sum. The detected codewords from this first decoding are used to undo the code concatenation within the received vector, allowing the replicated codewords to be decoded. The output of the overall decoder of the concatenated code consists of the information symbols from the first decoder followed by the information symbols from the second-stage decoder. Multiple codewords may be replicated and added to the codewords from other codes, so that the concatenated code consists of several shorter codewords which are decoded first, the decoded codewords then being used to decode the remaining codewords. It is also possible to utilise a recurrent construction whereby the replicated codewords are themselves concatenated codewords, in which case the receiver has to use more than two stages of decoding.

With suitable modifications, any type of error-correction decoder may be utilised including iterative decoders, Viterbi decoders, list decoders, and ordered reliability decoders, and of particular importance the modified Dorsch decoder described in Chap. 15. It is well known that for a given code rate longer codes have better performance than shorter codes, but implementation of a maximum likelihood decoder is much more difficult for longer codes. The Plotkin sum code construction method provides a means whereby several decoders for short codes may be used together to implement a near maximum likelihood decoder for a long code.

## **16.2 Description of the System**

Figure 16.1 shows the generic structure of the transmitted signal, in which the codeword of length $n\_1$ from code $u$, denoted as $C\_u$, is followed by a codeword comprising the sum of the same codeword and another codeword from code $v$, denoted as $C\_v$, to form a codeword denoted as $C\_{cat}$ of length $2n\_1$. This code construction is well known as the |*u*|*u* + *v*| code construction [3]. The addition is carried out symbol by symbol using the arithmetic rules of the Galois field being used, namely $GF(q)$. If code $u$ is an $(n\_1, k\_1, d\_1)$ code with $k\_1$ information symbols and Hamming distance $d\_1$, and code $v$ is an $(n\_1, k\_2, d\_2)$ code with $k\_2$ information symbols and Hamming distance $d\_2$, the concatenated code $C\_{cat}$ is a $(2n\_1, k\_1 + k\_2, d\_3)$ code with Hamming distance $d\_3$ equal to the smaller of $2 \times d\_1$ and $d\_2$.
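The construction itself is a one-line operation; a minimal binary sketch follows, with illustrative toy codewords (the distance property min(2d1, d2) is the one stated above, not verified by this snippet).

```python
# Sketch of the |u|u+v| (Plotkin sum) construction for binary codes:
# concatenate u with the symbol-by-symbol sum u + v.
def plotkin_sum(u, v):
    assert len(u) == len(v)
    return u + [a ^ b for a, b in zip(u, v)]

u = [1, 0, 1, 1]      # codeword of code u (illustrative)
v = [0, 1, 1, 0]      # codeword of code v (illustrative)
print(plotkin_sum(u, v))  # [1, 0, 1, 1, 1, 1, 0, 1]
```

For $GF(q)$ codes the XOR is replaced by addition in $GF(q)$, as the text notes.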

Prior to transmission, symbols from the concatenated codeword are mapped to signal constellation points in order to maximise the Euclidean distance between transmitted symbols, in keeping with current best transmission practice; see, for example, the textbook by Proakis [4]. The mapped concatenated codeword is denoted as $X\_{cat}$ and is given by

$$
X\_{cat} = |X\_u|X\_{u+v}| = |X\_u|X\_w|, \tag{16.1}
$$

where $X\_w$ is used to represent $X\_{u+v}$.

$X\_{cat}$ consists of $2 \times n\_1$ symbols: the first $n\_1$ symbols are the $n\_1$ symbols of $X\_u$, and the second $n\_1$ symbols result from mapping the symbol-by-symbol summation of the $n\_1$ symbols of $C\_u$ and the $n\_1$ symbols of codeword $C\_v$.

The encoding system to produce the concatenated codeword format shown in Fig. 16.1 is shown in Fig. 16.2. For each concatenated codeword, $k\_1$ information symbols are input to the encoder for the $(n\_1, k\_1, d\_1)$ code and $n\_1$ symbols are produced at the output of the encoder and stored in codeword buffer A, as shown in Fig. 16.2. Additionally, for each concatenated codeword, $k\_2$ information symbols are input to the encoder for the $(n\_1, k\_2, d\_2)$ code and $n\_1$ symbols are produced at the output and stored in codeword buffer B, as shown in Fig. 16.2. The encoded symbols

**Fig. 16.1** Format of transmitted codeword consisting of two shorter codewords

**Fig. 16.2** Concatenated code encoder and mapping for transmission

output from codeword buffer A are added symbol by symbol to the encoded symbols output from codeword buffer B and the results are stored in codeword buffer C. The codeword stored in codeword buffer A is $C\_u$, as depicted in Fig. 16.1, and the codeword stored in codeword buffer C is $C\_u + C\_v$, as also depicted in Fig. 16.1. The encoded symbols output from codeword buffer A are mapped to transmission symbols and transmitted to the channel; these are followed sequentially by the symbols output from codeword buffer C, which are also mapped to transmission symbols and transmitted to the channel, as shown in Fig. 16.2.

After transmission through the communications medium, each concatenated mapped codeword is received as the received vector, denoted as $R\_{cat}$ and given by

$$
R\_{cat} = |R\_u|R\_{u+v}| = |R\_u|R\_w|. \tag{16.2}
$$

Codeword $C\_v$ is decoded first, as shown in Fig. 16.3. By comparing the received samples $R\_u$ with the received samples $R\_{u+v}$, the a priori log likelihoods of the symbols of $R\_v$ may be determined, since the difference between the respective samples, in the absence of noise and distortion, is attributable to $C\_v$. This is done by the soft decision metric calculator shown in Fig. 16.3.

Binary codeword symbols are considered with values which are either 0 or 1. The $i$th transmitted sample is $X\_{u\_i} = (-1)^{C\_{u\_i}}$ and the $(n\_1 + i)$th transmitted sample is $X\_{u\_i+v\_i} = (-1)^{C\_{u\_i}} \times (-1)^{C\_{v\_i}}$. It is apparent that $X\_{v\_i}$ and $C\_{v\_i}$ may be derived from $X\_{u\_i}$ and $X\_{u\_i+v\_i}$.

An estimate of *Xvi* and *Cvi* may be derived from *Rui* and *Rui*+*vi* . First:

$$X\_{v\_i} = X\_{u\_i} \times X\_{u\_i + v\_i} = (-1)^{C\_{u\_i}} \times (-1)^{C\_{u\_i}} \times (-1)^{C\_{v\_i}} = (-1)^{C\_{v\_i}} \tag{16.3}$$

Second, in the absence of distortion and with Gaussian distributed additive noise with standard deviation σ, and normalised signal power, the log likelihood that *Cvi* = 0, *Llog*(*Cvi* = 0) is given by

$$L\_{\log}(C\_{v\_i} = 0) = \log \left[ \cosh \left( \frac{R\_{u\_i} + R\_{u\_i + v\_i}}{\sigma^2} \right) \right] - \log \left[ \cosh \left( \frac{R\_{u\_i} - R\_{u\_i + v\_i}}{\sigma^2} \right) \right]. \tag{16.4}$$
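Equation (16.4) can be computed directly; the sketch below uses a numerically stable form of log cosh to avoid overflow at high signal-to-noise ratio (the function names and vectorised interface are illustrative assumptions):

```python
import numpy as np

def logcosh(t):
    """Numerically stable log(cosh(t))."""
    t = np.abs(t)
    return t + np.log1p(np.exp(-2.0 * t)) - np.log(2.0)

def llr_cv(r_u, r_uv, sigma):
    """A priori log likelihood that C_v[i] = 0, per Eq. (16.4).
    r_u holds the received C_u section, r_uv the received C_u + C_v section."""
    r_u, r_uv = np.asarray(r_u, float), np.asarray(r_uv, float)
    return (logcosh((r_u + r_uv) / sigma**2)
            - logcosh((r_u - r_uv) / sigma**2))

# Noiseless sanity check with C_u = 0 (so X_u = +1):
L0 = llr_cv([1.0], [1.0], sigma=1.0)    # C_v = 0: same sign -> L > 0
L1 = llr_cv([1.0], [-1.0], sigma=1.0)   # C_v = 1: sign flip -> L < 0
```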

**Fig. 16.3** Decoder for the concatenated code with the codeword format shown in Fig. 16.1

The soft decision metric calculator, shown in Fig. 16.3, calculates these log likelihoods according to Eq. (16.4) and these are input to the decoder A shown in Fig. 16.3. The decoder A determines the most likely codeword *C<sup>v</sup>*<sup>ˆ</sup> of the (*n*1, *k*2, *d*2) code. With the knowledge of the detected codeword, *C<sup>v</sup>*ˆ, the received samples *R<sup>u</sup>*+*<sup>v</sup>*, which are stored in the *n*<sup>1</sup> symbols buffer B, are remapped to form *R<sup>u</sup>*<sup>ˆ</sup> by multiplying *R<sup>u</sup>*+*<sup>v</sup>* by *X<sup>v</sup>*ˆ.

$$
R\_{\hat{u}} = R\_{u+v} \times X\_{\hat{v}} \tag{16.5}
$$

This remapping function is provided by the remapper shown in Fig. 16.3. The output of the remapper is *R<sup>u</sup>*ˆ. If the decoder's output is correct, *C<sup>v</sup>*<sup>ˆ</sup> = *C<sup>v</sup>* and there are now two independent received versions of the transmitted, mapped codeword *C<sup>u</sup>* , *R<sup>u</sup>*<sup>ˆ</sup> and the original received *Ru*. Both of these are input to the soft metric combiner shown in Fig. 16.3, *R<sup>u</sup>*<sup>ˆ</sup> from the output of the remapper and *R<sup>u</sup>* from the output of the *n*<sup>1</sup> symbols buffer A.

The soft metric combiner calculates the log likelihood of each bit of *Cu*, *Cui* from the sum of the individual log likelihoods:

$$L\_{\log}(C\_{u\_i} = 0) = \frac{2R\_{u\_i}}{\sigma^2} + \frac{2R\_{\hat{u}\_i}}{\sigma^2}.\tag{16.6}$$

These log likelihood values, *L*log(*Cui* = 0), output from the soft metric combiner shown in Fig. 16.3 are input to the decoder B. The output of Decoder B is the *k*<sup>1</sup> information bits of the detected codeword *C<sup>u</sup>*<sup>ˆ</sup> of the (*n*1, *k*1, *d*1) code, and these are input to the information symbols buffer shown in Fig. 16.3. The other input to the information symbols buffer is the *k*<sup>2</sup> information bits of the detected codeword

**Fig. 16.4** Format of transmitted codeword consisting of three shorter codewords

*C<sup>v</sup>*<sup>ˆ</sup> of the (*n*1, *k*2, *d*2) code, provided at the output of decoder A. The output of the information symbols buffer, for each received vector, is the *k*<sup>1</sup> + *k*<sup>2</sup> information bits which were originally encoded, provided both decoders' outputs, A and B, are correct.
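The complete decoding chain of Fig. 16.3, soft decision metric calculation (16.4), decoding of the code *v* codeword, remapping (16.5), soft metric combining (16.6) and decoding of the code *u* codeword, can be sketched end to end. The toy constituent codes (a (4,1,4) repetition code for *v* and a (4,3,2) single parity check code for *u*) and the brute-force ML decoders are assumptions chosen purely to keep the example self-contained:

```python
import itertools
import numpy as np

def logcosh(t):
    t = np.abs(t)
    return t + np.log1p(np.exp(-2.0 * t)) - np.log(2.0)

def ml_decode(llr, codebook):
    """Brute-force ML decoding: maximise correlation of codeword signs
    (+1 for bit 0, -1 for bit 1) with the log likelihoods."""
    scores = [float(np.sum((1 - 2 * c) * llr)) for c in codebook]
    return codebook[int(np.argmax(scores))]

# Toy constituent codes with n1 = 4 (assumptions for illustration):
# code u: (4,3,2) single parity check code, code v: (4,1,4) repetition code.
code_u = [np.array(m + (sum(m) % 2,)) for m in itertools.product((0, 1), repeat=3)]
code_v = [np.zeros(4, dtype=int), np.ones(4, dtype=int)]

def decode_concatenated(r, sigma=1.0):
    r_u, r_uv = r[:4], r[4:]
    # Stage 1: soft decision metric (16.4), then decoder A detects C_v
    l_v = logcosh((r_u + r_uv) / sigma**2) - logcosh((r_u - r_uv) / sigma**2)
    c_v_hat = ml_decode(l_v, code_v)
    # Stage 2: remap (16.5), combine (16.6), then decoder B detects C_u
    r_u_hat = r_uv * (-1.0) ** c_v_hat
    l_u = 2 * r_u / sigma**2 + 2 * r_u_hat / sigma**2
    return ml_decode(l_u, code_u), c_v_hat

# Noiseless round trip: C_u = 1010 (a code u codeword), C_v = 1111
c_u, c_v = np.array([1, 0, 1, 0]), np.ones(4, dtype=int)
r = (-1.0) ** np.concatenate([c_u, (c_u + c_v) % 2])
c_u_hat, c_v_hat = decode_concatenated(r)
```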

In similar fashion to previous constructions, Fig. 16.4 shows the format of a concatenated codeword of length 4 × *n*1 symbols consisting of three shorter codewords. The codeword of length 2 × *n*1 from a (2*n*1, *k*1, *d*1) code *u*, denoted as *Cu*, is replicated as shown in Fig. 16.4. The first half of the replicated codeword *Cu* is added to the codeword *Cv*1 and the second half of the replicated codeword *Cu* is added to the codeword *Cv*2, as shown in Fig. 16.4. Each codeword *Cv*1 and *Cv*2 is the result of encoding *k*2 information symbols using code *v*, an (*n*1, *k*2, *d*2) code. The concatenated codeword that results, *Ccat*, is from a (4*n*1, *k*1 + 2*k*2, *d*3) concatenated code, where *d*3 is the smaller of 2*d*1 and *d*2.
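This construction and the relation *d*3 = min(2*d*1, *d*2) can be checked exhaustively with assumed miniature constituent codes, here a (4,1,4) repetition code for *u* and a (2,1,2) repetition code for *v*, so that *n*1 = 2 (the encoder interface is an illustrative assumption):

```python
import itertools
import numpy as np

def encode_fig_16_4(m_u, m_v1, m_v2, enc_u, enc_v):
    """Build the 4*n1 codeword of Fig. 16.4: C_u (length 2*n1) followed by a
    replica of C_u whose first half is offset by C_v1 and second half by C_v2."""
    c_u = enc_u(m_u)                                   # length 2*n1
    c_v = np.concatenate([enc_v(m_v1), enc_v(m_v2)])   # length 2*n1
    return np.concatenate([c_u, (c_u + c_v) % 2])      # length 4*n1

# Toy check of d3 = min(2*d1, d2) with n1 = 2:
# code u: (4,1,4) repetition, code v: (2,1,2) repetition.
rep = lambda n: (lambda m: np.full(n, m[0], dtype=int))
weights = [int(encode_fig_16_4(mu, mv1, mv2, rep(4), rep(2)).sum())
           for mu, mv1, mv2 in itertools.product([(0,), (1,)], repeat=3)
           if (mu, mv1, mv2) != ((0,), (0,), (0,))]
d3 = min(weights)   # minimum non-zero weight; expect min(2*4, 2) = 2
```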

The decoder for the concatenated code with codeword format shown in Fig. 16.4 is similar to the decoder shown in Fig. 16.3 except that following soft decision metric calculation each of the two codewords *C<sup>v</sup>*<sup>1</sup> and *C<sup>v</sup>*<sup>2</sup> are decoded independently. With the knowledge of the detected codewords, *C<sup>v</sup>*ˆ<sup>1</sup> and *C<sup>v</sup>*ˆ<sup>2</sup> , the received samples *R<sup>u</sup>*+*v*<sup>1</sup> , which are buffered, are remapped to form the first *n*<sup>1</sup> symbols of *R<sup>u</sup>*<sup>ˆ</sup> by multiplying *R<sup>u</sup>*+*v*<sup>1</sup> by *X<sup>v</sup>*ˆ<sup>1</sup> and the second *n*<sup>1</sup> symbols of *R<sup>u</sup>*<sup>ˆ</sup> are obtained by multiplying *R<sup>u</sup>*+*v*<sup>2</sup> by *X<sup>v</sup>*ˆ<sup>2</sup> .The two independent received versions of the transmitted, mapped codeword *Cu*, *R<sup>u</sup>*<sup>ˆ</sup> and the original received *R<sup>u</sup>* are input to a soft metric combiner prior to decoding the codeword *C<sup>u</sup>*ˆ.

In another code arrangement, Fig. 16.5 shows the format of a concatenated codeword of length 3 × *n*1 symbols. The concatenated codeword is the result of three layers of concatenation. A codeword of length *n*1 from an (*n*1, *k*1, *d*1) code *u*, denoted as *Cu*, is replicated twice, as shown in Fig. 16.5. A second codeword of length *n*1 from an (*n*1, *k*2, *d*2) code *v*, denoted as *Cv*, is replicated and each of these two codewords is added to one of the two replicated codewords *Cu*, as shown in Fig. 16.5. A third codeword of length *n*1 from an (*n*1, *k*3, *d*3) code *w*, denoted as *Cw*, is added to the codeword summation *Cu* + *Cv*, as shown in Fig. 16.5. The concatenated codeword that results, *Ccat*, is from a (3*n*1, *k*1 + *k*2 + *k*3, *d*4) concatenated code, where *d*4 is the smallest of 3*d*1, 2*d*2 and *d*3.

**Fig. 16.5** Format of transmitted codeword consisting of two levels of concatenation and three shorter codewords

**Fig. 16.6** Format of transmitted codeword after two stages of concatenation

The decoder for the three layered concatenated code with codeword format shown in Fig. 16.5 uses similar signal processing to the decoder shown in Fig. 16.3, with changes corresponding to the three layers of concatenation. The codeword *Cw*ˆ is decoded first following soft decision metric calculation using the *Ru*+*v* and *Ru*+*v*+*w* sections of the received vector. The detected codeword *Cw*ˆ is used to obtain two independent received versions of the transmitted, mapped result of the codeword summation *Cu*+*v*: the remapped *R*ˆ*u*+*v* and the original received *Ru*+*v*. These are input to a soft metric combiner and the output is input to the soft decision metric calculation together with *Ru*, prior to decoding of codeword *Cv*ˆ. With the knowledge of codeword *Cv*ˆ, remapping and soft metric combining are carried out prior to the decoding of codeword *Cu*ˆ.

Figure 16.6 shows the format of a concatenated codeword of length 4 × *n*1 symbols. The concatenated codeword is the result of three layers of concatenation. A concatenated codeword with the format shown in Fig. 16.1 is replicated and added to a codeword, *Cw*, of length 2*n*1 symbols from a (2*n*1, *k*3, *d*3) code to form a codeword of an overall concatenated code having parameters (4*n*1, *k*1 + *k*2 + *k*3, *d*4), where *d*4 is equal to the smallest of 4*d*1, 2*d*2 and *d*3.

The decoder for the three layered concatenated code with codeword format shown in Fig. 16.6 is similar to the decoder described above. Codeword *C<sup>w</sup>*<sup>ˆ</sup> is detected first following soft decision metric calculation using *R<sup>u</sup>* and *R<sup>u</sup>*+*<sup>v</sup>* sections of the received vector as one input and the *R<sup>u</sup>*+*<sup>w</sup>* and *R<sup>u</sup>*+*v*+*<sup>w</sup>* sections of the received vector as the other input. The detected codeword *C<sup>w</sup>*<sup>ˆ</sup> is used to obtain two independent received versions of the concatenated codeword of length 2*n*<sup>1</sup> symbols with format equal to that of Fig. 16.1. Accordingly, following soft metric combining of the two independent received versions of the concatenated codeword of length 2*n*<sup>1</sup> symbols, a vector of length equal to 2*n*<sup>1</sup> symbols is obtained which may be input to the concatenated code decoder shown in Fig. 16.3. This decoder provides at its output the *k*<sup>1</sup> + *k*<sup>2</sup> detected information symbols which together with the *k*<sup>3</sup> information symbols already detected provide the complete detected output of the overall three layer concatenated code.

Any type of code, binary or non-binary, LDPC, Turbo or algebraically constructed code, may be used. Any corresponding type of decoder, for example an iterative decoder or a list decoder may be used. As an illustration of this, Decoder A and Decoder B, shown in Fig. 16.3, do not have to be the same type of decoder.

There are particular advantages in using the modified Dorsch decoder, described in Chap. 15, because the Dorsch decoder may realise close to maximum likelihood decoding with reasonable decoder complexity. The complexity increases exponentially with codelength. Using modified Dorsch decoders, both decoder A and decoder B shown in Fig. 16.3 operate on *n*1 received samples and may realise close to maximum likelihood decoding with reasonable complexity, even though the concatenated codelength is 2 × *n*1 symbols and the total number of received samples is 2 × *n*1 samples. Using a single modified Dorsch decoder to decode the 2 × *n*1 samples of the concatenated code directly will usually result in non-maximum likelihood performance unless the list of codewords evaluated for each received vector is very long. For example, a modified Dorsch decoder with moderate complexity will typically process 100,000 codewords for each received vector and realise near maximum likelihood performance. Doubling the codelength will typically require in excess of 100,000,000 codewords to be processed for each received vector if near maximum likelihood performance is to be maintained.

An example of the performance that may be achieved is shown in Fig. 16.7 for the concatenated codeword format shown in Fig. 16.1. The encoder used is the same as that shown in Fig. 16.2 and the concatenated code decoder is the same as that shown in Fig. 16.3. The results were obtained by computer simulation using Quaternary Phase Shift Keying (QPSK) modulation on the Additive White Gaussian Noise (AWGN) channel. The decoder error rate, the ratio of the number of incorrect codewords output by the decoder to the total number of codewords output by the decoder, is denoted by the Frame Error Rate (FER) and this is plotted against *Eb*/*N*0, the ratio of the energy per information bit to the noise power spectral density. Binary codes are used and the length of the concatenated code is 256 bits. For best results, it is important to use outstanding codes for the constituent codes, particularly for code *v*, which is decoded first. In this example, code *u* is the (128,92,12) extended Bose Chaudhuri Hocquenghem (BCH) code. Code *v* is the (128,36,36) extended cyclic code, an optimum code described in [5] by D. Schoemaker and M. Wirtz. The (128,36,36) extended cyclic code is not an extended BCH code as it has roots {1, 3, 5, 7, 9, 11, 13, 19, 21, 27, 43, 47, 63}. The minimum Hamming distance

**Fig. 16.7** The error rate performance for a (256,128,24) concatenated code compared to iterative decoding of a (256,128,15) Turbo code and a (256,128,12) LDPC code

of the concatenated code is 2*d*1 = 24. Both decoder A and decoder B, as shown in Fig. 16.3, are modified Dorsch decoders and, for both code *u* and code *v*, near maximum likelihood performance is obtained with moderate decoder complexity. For each point plotted in Fig. 16.7, the number of codewords transmitted was chosen such that there were at least 100 codewords decoded in error.

Also shown in Fig. 16.7 is the performance of codes and decoders designed according to the currently known state of the art in error-correction coding, that is, Low Density Parity Check (LDPC) codes using Belief Propagation (BP) iterative decoding, and Turbo codes with BCJR iterative decoding. Featured in Fig. 16.7 is the performance of an optimised (256,128,12) LDPC code using BP iterative decoding and an optimised (256,128,15) Turbo code with iterative decoding. As shown in Fig. 16.7, both the (256,128,15) Turbo code and the (256,128,12) LDPC code suffer from an error floor for *Eb*/*N*0 values higher than 3.5 dB, whilst the concatenated code features a FER performance with no error floor. This is attributable to the significantly higher minimum Hamming distance of the concatenated code, which is equal to 24 in comparison to 15 for the Turbo code and 12 for the LDPC code. Throughout the entire range of *Eb*/*N*0 values the concatenated code can be seen to outperform the other codes and decoders.

For (512,256) codes, using the concatenated code arrangement, the performance achievable is shown in Fig. 16.8. The concatenated code arrangement uses the

**Fig. 16.8** Comparison of the error rate performance for a (512,256,32) concatenated code compared to iterative decoding of a (512,256,18) Turbo code and a (512,256,14) LDPC code

concatenated codeword format which is shown in Fig. 16.4. As before, the FER results were obtained by computer simulation using QPSK modulation and the AWGN channel. Both codes *v*1 and *v*2 are the same and equal to the outstanding (128,30,38) best-known code [6]. Code *u* is equal to a (256,196,16) extended cyclic code. Featured in Fig. 16.8 is the performance of an optimised (512,256,14) LDPC code using BP iterative decoding and an optimised (512,256,18) Turbo code with iterative decoding. For each point plotted in Fig. 16.8, the number of codewords transmitted was chosen such that there were at least 100 codewords decoded in error. As shown in Fig. 16.8, both the (512,256,18) Turbo code and the (512,256,14) LDPC code suffer from an error floor for *Eb*/*N*0 values higher than 3.4 dB, whilst the concatenated code features a FER performance with no error floor. As before, this is attributable to the significantly higher minimum Hamming distance of the concatenated code, which is equal to 32 in comparison to 18 for the Turbo code and 14 for the LDPC code. Throughout the entire range of *Eb*/*N*0 values, the concatenated code system can be seen to outperform the other coding arrangements for (512,256) codes.

## **16.3 Concatenated Coding and Modulation Formats**

With the |*u*|*u* + *v*| code construction and binary transmission, the received vector for the codeword of code *v* suffers the full interference from the codeword of code *u* because it is transmitted as *u* + *v*. The interference is removed by differential detection using the first version of the codeword of code *u*. However, although the effects of code *u* are removed, differential detection introduces additional noise power due to noise times noise components. One possible solution to reduce this effect is to use multi-level modulation such as 8-PSK. Code *u* is transmitted as 4-PSK and code *v* modulates the 4-PSK constellation by ±22.5 degrees. Now there is less direct interference between code *u* and code *v*. Initial investigations show that this approach is promising, particularly for higher rate systems.
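The combined mapping can be checked directly: rotating each QPSK point by ±22.5 degrees yields exactly the standard 8-PSK constellation, so code *v* is carried in the phase offset rather than in a full sign inversion of the code *u* symbols (the variable names are illustrative):

```python
import numpy as np

# Code u selects one of four QPSK phases; code v rotates the constellation
# by +/-22.5 degrees, so each transmitted symbol lands on an 8-PSK point.
qpsk_deg = np.array([45.0, 135.0, 225.0, 315.0])   # two bits from code u
offsets_deg = (+22.5, -22.5)                       # one bit from code v

combined = sorted((q + o) % 360.0 for q in qpsk_deg for o in offsets_deg)
eight_psk = [22.5 + 45.0 * k for k in range(8)]    # standard 8-PSK phases
```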

#### **16.4 Summary**

Concatenation of good codes is a classic method of constructing longer codes which are also good. As codes increase in length, it becomes progressively harder to realise a near maximum likelihood decoder. This chapter presented a novel concatenated code arrangement featuring multiple near maximum likelihood decoders for an optimised matching of codes and decoders. It was demonstrated that, by using some outstanding codes as constituent codes, the concatenated coding arrangement is able to outperform the best LDPC and Turbo coding systems with the same code parameters. The performance of a net (256,128) code achieved with the concatenated arrangement was compared to a best (256,128) LDPC code and a best (256,128) Turbo code. Similarly, the performance of a net (512,256) concatenated code was compared to a best (512,256) LDPC code and a best (512,256) Turbo code. In both cases, the new system was shown to outperform the LDPC and Turbo systems. To date, for the AWGN channel and net half-rate codes, no other code or coding arrangement is known that will outperform the system presented in this chapter for codes of lengths 256 and 512 bits.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Part IV Applications**

This part is concerned with a wide variety of applications using error-correcting codes. Analysis is presented of combined error-detecting and error-correcting codes which enhance the reliability of digital communications by using the parity check bits for error detection as well as for error-correction. A worked example of code construction for a (251, 113, 20) incremental redundancy error-correcting code is described. The idea is that additional sequences of parity bits may be transmitted in stages until the decoded codeword satisfies a Cyclic Redundancy Check (CRC). A soft decision scheme for measuring codeword reliability is also described which does not require a CRC to be transmitted. The relative performance of the undetected error rate and throughput of the different systems is presented.

In this part it is also shown that error-correcting codes may be used for the automatic correction of small errors in password authentication systems or in submitting personal identification information. An adaptive mapping of *GF(q)* symbols is used to convert a high percentage of passwords into Reed–Solomon codewords without the need for additional parity check symbols. It is shown that a BCH decoder may be used for error-correction or error detection. Worked examples of codes and passwords are included.

Goppa codes are used as the basis of a public key cryptosystem invented by Professor Robert McEliece. The way in which Goppa codes are designed into the cryptosystem is illustrated with step by step worked examples showing how a ciphertext is constructed and subsequently decrypted. The cryptosystem is described in considerable detail together with some proposed system variations designed to reduce the ciphertext length with no loss in security. An example is presented in which the system realises 256 bits of security using a ciphertext length of 1912 bits, whereas a ciphertext length of 8192 bits would normally be required.

Different attacks designed to break the McEliece cryptosystem are described, including the information set decoding attack. Analysis is provided showing the security level achieved by the cryptosystem as a function of Goppa code length. Vulnerabilities of the standard McEliece system to chosen plaintext and chosen ciphertext attacks are described, together with system modifications that defeat these attacks. Some commercial applications are described that are based on using a smartphone for secure messaging and cloud based, encrypted information access.

The use of error-correcting codes in impressing watermarks on different media by using dirty paper coding is included in this part. The method described is based on firstly decoding the media or white noise with a cross correlating decoder so as to find sets of codewords from a given code that will cause minimum change to the media but still be detectable. The best codewords are added to the media as a watermark so as to convey additional information, as in steganography. Some examples are included using a binary (47, 24, 11) quadratic residue code.

# **Chapter 17 Combined Error Detection and Error-Correction**

## **17.1 Analysis of Undetected Error Probability**

Let the space of vectors of length *n* over a field with *q* elements F*q* be denoted by F*q*<sup>*n*</sup>. Let [*n*, *k*, *d*]*q* denote a linear code over F*q* of length *n* symbols, dimension *k* symbols and minimum Hamming distance *d*. We know that a code with minimum Hamming distance *d* can correct *t* = ⌊(*d* − 1)/2⌋ errors. It is possible for an [*n*, *k*, *d* = 2*t* + 1]*q* linear code, which has *q*<sup>*n*−*k*</sup> syndromes, to use a subset of these syndromes to correct τ < *t* errors and then to use the remaining syndromes for error detection. For convenience, let *C* denote an [*n*, *k*, *d*]*q* linear code with cardinality |*C*|, and let a codeword of *C* be denoted by *c*<sub>*l*</sub> = (*c*<sub>*l*,0</sub>, *c*<sub>*l*,1</sub>, ..., *c*<sub>*l*,*n*−1</sub>), where 0 ≤ *l* < |*C*|.
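The split of the syndrome budget between correction and detection is easy to quantify; correcting up to τ errors consumes one syndrome per correctable error pattern. A minimal sketch (the function name is an illustrative assumption):

```python
from math import comb

def syndrome_budget(n, k, q, tau):
    """Split the q**(n-k) syndromes of an [n, k, d] code between those used
    for correcting up to tau errors and those left over for error detection."""
    total = q ** (n - k)
    used = sum(comb(n, i) * (q - 1) ** i for i in range(tau + 1))
    return total, used, total - used

# Binary [15,7,5] BCH code: t = 2, but correcting only tau = 1 error leaves
# most of the 256 syndromes free for error detection.
total, used, left = syndrome_budget(15, 7, 2, 1)
```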

Consider a codeword *c*<sub>*i*</sub>, for some integer *i*, which is transmitted over a *q*-ary symmetric channel with symbol transition probability *p*/(*q* − 1). At the receiver, a length *n* vector *y* is received. This vector *y* is not necessarily the same as *c*<sub>*i*</sub> and, denoting by d*H*(*a*, *b*) the Hamming distance between vectors *a* and *b*, the following possibilities may occur, assuming that a nearest neighbour decoding algorithm is employed:


**Definition 17.1** A sphere of radius *t* centered at a vector *v* ∈ F*q*<sup>*n*</sup>, denoted by *S*<sup>*t*</sup><sub>*q*</sub>(*v*), is defined as

$$S\_q^t(\mathbf{v}) = \{ \mathbf{w} \in \mathbb{F}\_q^n \mid wt\_H(\mathbf{v} - \mathbf{w}) \le t \}. \tag{17.1}$$

It can be seen that, in an error-detection-after-correction case, a sphere *S*<sup>τ</sup><sub>*q*</sub>(*c*) may be drawn around each of the |*C*| codewords of the code *C*. For any vector falling within *S*<sup>τ</sup><sub>*q*</sub>(*c*), the decoder returns the corresponding codeword *c*, which is the center of the sphere. It is worth noting that all these |*C*| spheres are pairwise disjoint, i.e.


$$S\_q^{\tau}(\mathbf{c}\_i) \cap S\_q^{\tau}(\mathbf{c}\_j) = \emptyset \quad \text{for all } 0 \le i, j < |\mathscr{C}|, \; i \ne j.$$

In a pure error-detection scenario, the radius of these spheres is zero and the probability of an undetected error is minimised. When the code is used to correct a given number of errors, the radius increases and so does the probability of undetected error.

**Lemma 17.1** *The number of length n vectors over* F*q* *of weight j within a sphere of radius* τ *centered at a length n vector of weight i, denoted by N*<sup>τ</sup><sub>*q*</sub>(*n*, *i*, *j*)*, is equal to*

$$N\_q^{\tau}(n, i, j) = \sum\_{e=e\_L}^{e\_U} \sum\_{\delta=\delta\_L}^{\delta\_U} \binom{i}{e} \binom{e}{\delta} \binom{n-i}{j-i+\delta} (q-1)^{j-i+\delta} (q-2)^{e-\delta} \tag{17.2}$$

*where e*<sub>*L*</sub> = max(0, *i* − *j*)*, e*<sub>*U*</sub> = min(τ, τ + *i* − *j*)*,* δ<sub>*L*</sub> = max(0, *i* − *j*) *and* δ<sub>*U*</sub> = min(*e*, τ + *i* − *j* − *e*, *n* − *j*)*.*

*Proof* Let *u* be a vector of weight *i*, and let sup(*u*) denote the support of *u* and let its complement, the non-support of *u*, be denoted by the overlined form, that is

$$\begin{aligned} \sup(\mathbf{u}) &= \{ i \mid u\_i \neq 0, \text{ for } 0 \le i \le n - 1 \}, \\ \overline{\sup}(\mathbf{u}) &= \{0, 1, \dots, n - 1\} \backslash \sup(\mathbf{u}). \end{aligned}$$

A vector of weight *j*, denoted by *v*, may be obtained by adding to *u* a vector *w* which has *e* non-zero coordinates among the elements of sup(*u*) and *f* non-zero coordinates among the elements of the non-support of *u*. In the case where *q* > 2, considering the coordinates in sup(*u*), the vector *v* = *u* + *w* can have more than *i* − *e* non-zeros in these coordinates. Let δ, where 0 ≤ δ ≤ *e*, denote the number of coordinates of sup(*u*) for which *v*<sub>*i*</sub> = 0, i.e.

$$\delta = |\sup(\mathbf{u})\backslash(\sup(\mathbf{u}) \cap \sup(\mathbf{v}))|.$$

Given an integer *e*, there are $\binom{i}{e}$ ways to choose the *e* coordinates *i* ∈ sup(*u*) for which *w*<sub>*i*</sub> ≠ 0. For each choice, there are $\binom{e}{e-\delta}(q-2)^{e-\delta}$ ways to generate the *e* − δ non-zeros in the coordinates sup(*u*) ∩ sup(*w*) such that *v*<sub>*i*</sub> ≠ 0. It follows that *f* = *j* − (*i* − *e*) − (*e* − δ) = *j* − *i* + δ, and there are $\binom{n-i}{j-i+\delta}(q-1)^{j-i+\delta}$ ways to generate the *f* non-zero coordinates *v*<sub>*i*</sub> ≠ 0 with *i* in the non-support of *u*. Therefore, for given integers *e* and δ, we have

$$
\binom{i}{e}\binom{e}{\delta}\binom{n-i}{j-i+\delta}(q-1)^{j-i+\delta}(q-2)^{e-\delta}\tag{17.3}
$$

vectors *w* that produce *wt*<sub>*H*</sub>(*v*) = *j*. Note that $\binom{e}{e-\delta} = \binom{e}{\delta}$.

It is obvious that 0 ≤ *e*, *f* ≤ τ and *e* + *f* ≤ τ. In the case of *j* ≤ *i*, the integer *e* may not take the entire range of values from 0 to τ; it is not possible to have *e* < *i* − *j*. On the other hand, for *j* ≥ *i*, the integer *e* ≥ 0 and thus the lower limit on the value of *e* is *e*<sub>*L*</sub> = max(0, *i* − *j*). The upper limit of *e*, denoted by *e*<sub>*U*</sub>, is dictated by the condition *e* + *f* ≤ τ. For *j* ≤ *i*, *e*<sub>*U*</sub> = τ since, for any value of *e*, δ may be adjusted such that *wt*<sub>*H*</sub>(*v*) = *j*. For the case *j* ≥ *i*, *f* ≥ 0 and, for any value of *e*, there exists at least one vector for which δ = 0, implying *e*<sub>*U*</sub> = τ − *f* = τ + *i* − *j*. It follows that *e*<sub>*U*</sub> = min(τ, τ + *i* − *j*).

For a given value of *e*, δ takes certain values in the range between 0 and *e* such that *wt*<sub>*H*</sub>(*v*) = *j*. The lower limit of δ is obviously δ<sub>*L*</sub> = *e*<sub>*L*</sub>. The upper limit of δ for the *j* ≥ *i* case is also obvious, δ<sub>*U*</sub> = *e*, since *f* ≥ 0. For the case *j* ≤ *i*, we have *e* + *f* = *e* + (*j* − *i* + δ<sub>*U*</sub>) ≤ τ, implying δ<sub>*U*</sub> ≤ τ − *e* + *i* − *j*. In addition, *n* − *i* ≥ *j* − *i* + δ<sub>*U*</sub> and thus we have δ<sub>*U*</sub> = min(*e*, τ − *e* + *i* − *j*, *n* − *j*).
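Lemma 17.1 can be checked exhaustively for small parameters. The sketch below compares Eq. (17.2) against a brute-force count over F*q*<sup>*n*</sup> (the function names are illustrative assumptions):

```python
from itertools import product
from math import comb

def N_formula(q, tau, n, i, j):
    """N_q^tau(n, i, j) of Eq. (17.2): number of weight-j vectors inside a
    radius-tau sphere centred on a weight-i vector."""
    total = 0
    for e in range(max(0, i - j), min(tau, tau + i - j) + 1):
        d_hi = min(e, tau + i - j - e, n - j)
        for d in range(max(0, i - j), d_hi + 1):
            total += (comb(i, e) * comb(e, d) * comb(n - i, j - i + d)
                      * (q - 1) ** (j - i + d) * (q - 2) ** (e - d))
    return total

def N_brute(q, tau, n, i, j):
    """Exhaustive count over F_q^n, for cross-checking the formula."""
    u = (1,) * i + (0,) * (n - i)         # any weight-i vector will do
    return sum(1 for v in product(range(q), repeat=n)
               if sum(x != 0 for x in v) == j
               and sum(a != b for a, b in zip(u, v)) <= tau)
```

Running both over all (*i*, *j*) pairs for small *n*, *q* and τ confirms the counting argument of the proof, including the (*q* − 2)<sup>*e*−δ</sup> term that vanishes for *q* = 2.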

**Corollary 17.1** *For q* = 2*, we have*

$$N\_2^{\tau}(n, i, j) = \sum\_{e=\max(0, i-j)}^{\lfloor (\tau+i-j)/2 \rfloor} \binom{i}{e} \binom{n-i}{j-i+e} \tag{17.4}$$

*Proof* For *q* = 2, it is obvious that δ = *e* and 0<sup>0</sup> = 1. Since *e* + *f* ≤ τ and *f* = *j* − *i* + *e*, the upper limit of *e*, *e*<sub>*U*</sub>, becomes *e*<sub>*U*</sub> = ⌊(τ + *i* − *j*)/2⌋.

**Theorem 17.1** *For an* [*n*, *k*, *d* = 2*t* + 1]*<sup>q</sup> linear code C , the probability of undetected error after correcting at most* τ *errors, where* τ ≤ *t, in a q-ary symmetric channel with transition probability p*/(*q* − 1)*, is given by*

$$P\_{uc}^{(\tau)}(\mathscr{C}, p) = \sum\_{i=d}^{n} A\_i \sum\_{j=i-\tau}^{i+\tau} N\_q^{\tau}(n, i, j) \left(\frac{p}{q-1}\right)^j (1-p)^{n-j} \tag{17.5}$$

*where A*<sub>*i*</sub> *is the number of codewords of weight i in C and N*<sup>τ</sup><sub>*q*</sub>(*n*, *i*, *j*) *is given in Lemma 17.1.*

*Proof* An undetected error occurs if the received vector falls within a sphere of radius τ centered at any codeword of *C* other than the transmitted codeword. Without loss of generality, as the code is linear, the transmission of the all zeros codeword may be assumed. Considering *c*<sub>*i*</sub>, a codeword of weight *i* > 0, all vectors within *S*<sup>τ</sup><sub>*q*</sub>(*c*<sub>*i*</sub>) have weights ranging from *i* − τ to *i* + τ with respect to the transmitted all zeros codeword. For each weight *j* in this range, there are *N*<sup>τ</sup><sub>*q*</sub>(*n*, *i*, *j*) such vectors in the sphere.
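Theorem 17.1 can be verified for the binary [7,4,3] Hamming code by comparing (17.5), with the binary count of Corollary 17.1, against an exhaustive enumeration of channel outputs (the generator matrix and function names are assumptions for illustration):

```python
from itertools import product
from math import comb

# [7,4,3] Hamming code from a systematic generator matrix
G = [[1,0,0,0,0,1,1], [0,1,0,0,1,0,1], [0,0,1,0,1,1,0], [0,0,0,1,1,1,1]]
codewords = [tuple(sum(m[r] * G[r][c] for r in range(4)) % 2 for c in range(7))
             for m in product((0, 1), repeat=4)]
A = [sum(1 for c in codewords if sum(c) == w) for w in range(8)]

def N2(tau, n, i, j):
    """Corollary 17.1: the binary special case of N_q^tau(n, i, j)."""
    return sum(comb(i, e) * comb(n - i, j - i + e)
               for e in range(max(0, i - j), (tau + i - j) // 2 + 1))

def p_undetected(tau, p, n=7):
    """Theorem 17.1 for q = 2: undetected error probability after
    correcting at most tau errors (sum over weights i >= d = 3)."""
    return sum(A[i] * sum(N2(tau, n, i, j) * p**j * (1 - p)**(n - j)
                          for j in range(max(0, i - tau), min(n, i + tau) + 1))
               for i in range(3, n + 1))

def p_undetected_brute(tau, p, n=7):
    """Direct enumeration: received vectors landing in a radius-tau sphere
    around a non-transmitted (here, nonzero) codeword."""
    nonzero = [c for c in codewords if any(c)]
    return sum(p**sum(y) * (1 - p)**(n - sum(y))
               for y in product((0, 1), repeat=n)
               if any(sum(a != b for a, b in zip(y, c)) <= tau for c in nonzero))
```

For τ = 0 this reduces to the classical pure-detection formula, and for τ = 1 it illustrates the increase in undetected error probability paid for the error-correction capability.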

Following [2], if *B*<sub>*j*</sub> denotes the number of codewords of weight *j* in *C*<sup>⊥</sup>, the dual code of *C*, then *A*<sub>*m*</sub> may be written as

$$A\_m = \frac{1}{|\mathscr{C}^{\perp}|} \sum\_{i=0}^n B\_i P\_q(n, m, i) \tag{17.6}$$

where

$$P\_q(n,m,i) = \sum\_{j=0}^{m} (-1)^j q^{m-j} \binom{n-m+j}{j} \binom{n-i}{m-j} \tag{17.7}$$

is a Krawtchouk polynomial. Using (17.6) and (17.7), the probability of undetected error after error-correction (17.5) may be rewritten in terms of the weight of the codewords in the dual code.
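As a numerical check of (17.6) and (17.7): the dual of the [7,4,3] Hamming code is the [7,3,4] simplex code with *B*0 = 1 and *B*4 = 7, and the identity, normalised by the cardinality of the dual code, recovers the Hamming weight distribution (the code below is an illustrative sketch):

```python
from math import comb

def krawtchouk(q, n, m, i):
    """The polynomial P_q(n, m, i) of Eq. (17.7)."""
    return sum((-1)**j * q**(m - j) * comb(n - m + j, j) * comb(n - i, m - j)
               for j in range(m + 1))

# Dual of the [7,4,3] Hamming code: the [7,3,4] simplex code, B_0 = 1, B_4 = 7.
n, q = 7, 2
B = [1, 0, 0, 0, 7, 0, 0, 0]
dual_size = sum(B)                 # |dual code| = q**(n-k) = 8
A = [sum(B[i] * krawtchouk(q, n, m, i) for i in range(n + 1)) // dual_size
     for m in range(n + 1)]
# A recovers the Hamming weight distribution [1, 0, 0, 7, 7, 0, 0, 1]
```

This allows the undetected error probability (17.5) to be evaluated when only the dual weight distribution is known, which is useful for high-rate codes whose duals are small enough to enumerate.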

#### **17.2 Incremental-Redundancy Coding System**

#### *17.2.1 Description of the System*

The main area of application is two-way digital communication systems, with particular importance to wireless communication systems which feature packet digital communications using a two-way communications medium. In wireless communications, each received packet is subject to multipath effects and noise plus interference, causing errors in some of the received symbols. Typically, forward error-correction (FEC) is provided using convolutional codes, turbo codes, LDPC codes or algebraic block codes, and at the receiver a forward error-correction decoder is used to correct any transmission errors. Any residual errors are detected using a cyclic redundancy check (CRC) which is included in each transmitted codeword. The CRC is calculated for each codeword that is decoded from the corresponding received symbols and, if the CRC is not satisfied, the codeword is declared to be in error. If such an error is detected, the receiver requests the transmitter, by means of an automatic repeat request (ARQ), either to retransmit the codeword or to transmit additional redundant symbols. Since this is a hybrid form of error-correction coupled with error-detection feedback through the ARQ mechanism, it is commonly referred to as a hybrid automatic repeat request (HARQ) system.

The two known forms of HARQ are Chase combining and incremental redundancy (IR). Chase combining is a simplified form of HARQ, wherein the receiver simply requests retransmission of the original codeword and the received symbols corresponding to the codeword are combined together prior to repeated decoding and detection. IR provides for a transmission of additional parity symbols extending the length of the codeword and increasing the minimum Hamming distance, *dmin* between codewords. This results in a lower error rate following decoding of the extended codeword. The average throughput of such a system is higher than a fixed code rate system which always transmits codewords of maximum length and redundancy. In HARQ systems, it is a prerequisite that a reliable means be provided to detect errors in each decoded codeword. A system is described below which is able to provide an improvement to current HARQ systems by providing a more reliable means of error detection using the CRC and also provides for an improvement in

**Fig. 17.1** Codeword format for conventional incremental-redundancy ARQ schemes

throughput by basing the error detection on the reliability of the detected codeword without the need to transmit the CRC.

Figure 17.1 shows the generic structure of the transmitted signal for a punctured codeword system. The transmitted signal comprises the initial codeword followed by additional parity symbols which are transmitted following each ARQ request, up to a total of *M* transmissions for each codeword. All of the different types of codes used in HARQ systems (convolutional, turbo, LDPC, and algebraic codes) can be constructed to fit into this generic codeword structure. As shown in Fig. 17.1, the maximum length of each codeword is *nM* symbols, transmitted in a total of *M* transmissions resulting from the reception of *M* − 1 negative acknowledgements (NACKs). The first transmission consists of *m* information symbols encoded into a total of *n*<sup>1</sup> symbols; there are *r*<sup>1</sup> parity symbols in addition to the CRC symbols. This is equivalent to puncturing the maximum length codeword in the last *nM* − *n*<sup>1</sup> symbols. If this codeword is not decoded correctly, a NACK is received by the transmitter (indicated either by the absence of an ACK or by an explicit NACK signal being received), and *r*<sup>2</sup> further parity symbols are transmitted as shown in Fig. 17.1.

The detection of an incorrect codeword is derived from the CRC in conventional HARQ systems. After the decoding of the received codeword, the CRC is recalculated and compared to the CRC symbols contained in the decoded codeword. If there is no match, then an incorrect codeword is declared and a NACK is conveyed to the transmitter. Following the second transmission, the decoder has a received codeword consisting of *n*<sup>1</sup> + *r*<sup>2</sup> symbols which are decoded. The CRC is recalculated and compared to the decoded CRC symbols. If there is still no match, a NACK is conveyed to the transmitter and the third transmission consists of the *r*<sup>3</sup> parity symbols and the net codeword consisting of *n*<sup>1</sup> + *r*<sup>2</sup> + *r*<sup>3</sup> symbols is decoded, and so on. The IR procedure ends either when an ACK is received by the transmitter or when a codeword of total length *nM* symbols has been transmitted in a total of *M* transmissions.
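The conventional CRC-gated IR receive loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `decode` and `crc_ok` are hypothetical stand-ins for the actual FEC decoder and the CRC recalculation-and-compare step.

```python
# Hypothetical sketch of the conventional IR receive loop: symbols
# accumulate across stages and decoding is re-attempted on the
# lengthened codeword after each NACK.
def ir_receive(stages, decode, crc_ok):
    """stages: list of symbol blocks received per transmission (n1, r2, r3, ...).
    decode: maps the accumulated received symbols to a decoded codeword.
    crc_ok: recomputes the CRC over the decoded codeword and compares it
    with the CRC symbols embedded in the codeword."""
    received = []
    for i, block in enumerate(stages, start=1):
        received.extend(block)           # n1, then n1+r2, ... symbols so far
        codeword = decode(received)
        if crc_ok(codeword):
            return ("ACK", i, codeword)  # accept; stop requesting parity
        # otherwise a NACK is conveyed and the next parity block is sent
    return ("FAIL", len(stages), None)   # all M transmissions exhausted
```

The loop terminates either on the first CRC match (ACK) or after the *M*th transmission, mirroring the text.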

Most conventional HARQ systems first encode the *m* information symbols plus CRC symbols into a codeword of length *nM* symbols, where *C<sup>M</sup>* = [*nM* , *k*, *dM* ] denotes this code. The code *C<sup>M</sup>* is then punctured by removing the last *nM* − *nM*−<sup>1</sup>

**Fig. 17.2** Codeword format for the incremental-redundancy ARQ scheme without a CRC

symbols to produce a code *CM*−<sup>1</sup> = [*nM*−<sup>1</sup>, *k*, *dM*−1]; the code *CM*−<sup>1</sup> is then punctured by removing the last *nM*−<sup>1</sup> − *nM*−<sup>2</sup> symbols to produce a code *CM*−2, and so forth, until a code *C*<sup>1</sup> = [*n*1, *k*, *d*1] is obtained. In this way, a sequence of codes *C*<sup>1</sup> = [*n*1, *k*, *d*1], *C*<sup>2</sup> = [*n*2, *k*, *d*2], ..., *C<sup>M</sup>* = [*nM* , *k*, *dM* ] is obtained. In the first transmission stage, a codeword of *C*<sup>1</sup> is transmitted; in the second transmission stage, the punctured parity symbols of *C*<sup>2</sup> are transmitted, and so on, as shown in Fig. 17.1.

An alternative IR code construction method is to produce a sequence of codes using a generator matrix formed from a juxtaposition of the generator matrices of a nested block code. In this way, no puncturing is required.

Figure 17.2 shows the structure of the transmitted signal. The transmitted signal format is the same as Fig. 17.1 except that no CRC symbols are transmitted. The initial codeword consists only of the *m* information symbols plus the *r*<sup>1</sup> parity symbols. Additional parity symbols are transmitted following each ARQ request, up to a total of *M* transmissions for each codeword. All of the different types of codes used in HARQ systems (convolutional, turbo, LDPC, and algebraic codes) may be used in this format, including the sequence of codes based on a nested block code construction.

Figure 17.3 shows a variation of the system where the *k* information symbols, denoted by the vector *u*, are encoded with the forward error-correction (FEC) encoder into *nM* symbols, denoted as *c<sup>M</sup>* , which are stored in the transmission controller. In the first transmission, *n*<sup>1</sup> symbols are transmitted. At the end of the *i*th stage, a codeword of total length *ni* symbols has been transmitted. This corresponds to a codeword of length *nM* symbols punctured in the last *nM* − *ni* symbols. In Fig. 17.3, the codeword of length *ni* is represented as a vector *v*, which is passed through the channel to produce *y* and buffered in the Received buffer. The buffered vector *y* is decoded by the FEC decoder, which produces the most likely codeword *c*<sup>1</sup> and the next most likely codeword *c*2.

**Fig. 17.3** The incremental-redundancy ARQ scheme with adjustable reliability without using a CRC

Let us consider that the IR system has had *i* transmissions so that a total of *ni* symbols have been received and the total length of the transmitted codeword is *ni* symbols.

*c*<sup>1</sup> is denoted as

$$\mathbf{c}\_1 = c\_{10} + c\_{11}\mathbf{x} + c\_{12}\mathbf{x}^2 + c\_{13}\mathbf{x}^3 + c\_{14}\mathbf{x}^4 + \dots + c\_{1(n\_i - 1)}\mathbf{x}^{n\_i - 1} \tag{17.8}$$

and *c*<sup>2</sup> is denoted as

$$\mathbf{c}\_2 = c\_{20} + c\_{21}\mathbf{x} + c\_{22}\mathbf{x}^2 + c\_{23}\mathbf{x}^3 + c\_{24}\mathbf{x}^4 + \dots + c\_{2(n\_i - 1)}\mathbf{x}^{n\_i - 1} \tag{17.9}$$

and the received symbols *y* are denoted as

$$\mathbf{y} = y\_0 + y\_1 \mathbf{x} + y\_2 \mathbf{x}^2 + y\_3 \mathbf{x}^3 + y\_4 \mathbf{x}^4 + \dots + y\_{(n\_i - 1)} \mathbf{x}^{n\_i - 1} \tag{17.10}$$

For each decoded codeword, *c*<sup>1</sup> and *c*2, the squared Euclidean distances *d*<sub>*E*</sub><sup>2</sup>( *y*, *c*1) and *d*<sub>*E*</sub><sup>2</sup>( *y*, *c*2), respectively, are calculated between the codewords and the received symbols *y* stored in the Received buffer.

*d*<sub>*E*</sub><sup>2</sup>( *y*, *c*1) is given by

$$d\_E^2(\mathbf{y}, \mathbf{c}\_1) = \sum\_{j=0}^{n\_i - 1} (y\_j - c\_{1j})^2 \tag{17.11}$$

*d*<sub>*E*</sub><sup>2</sup>( *y*, *c*2) is given by

$$d\_E^2(\mathbf{y}, \mathbf{c}\_2) = \sum\_{j=0}^{n\_i - 1} (y\_j - c\_{2j})^2 \tag{17.12}$$

The function of the Reliability estimator shown in Fig. 17.3 is to determine how much smaller *d*<sub>*E*</sub><sup>2</sup>( *y*, *c*1) is than *d*<sub>*E*</sub><sup>2</sup>( *y*, *c*2), in order to estimate the likelihood that the codeword *c*<sup>1</sup> is correct. The Reliability estimator calculates the squared Euclidean distances *d*<sub>*E*</sub><sup>2</sup>( *y*, *c*1) and *d*<sub>*E*</sub><sup>2</sup>( *y*, *c*2), and determines the difference Δ given by

$$
\Delta = d\_E^2(\mathbf{y}, \mathbf{c}\_2) - d\_E^2(\mathbf{y}, \mathbf{c}\_1) \tag{17.13}
$$

Δ is compared to a threshold which is calculated from the minimum Hamming distance of the first code in the sequence of codes, the absolute noise power, and a multiplicative constant, termed κ. As shown in Fig. 17.3, Δ is compared to the threshold by the Comparator. If Δ is less than the threshold, *c*<sup>1</sup> is considered to be insufficiently reliable, and the output of the comparator causes the ACK/NACK generator to convey a NACK to the transmitter so that more parity symbols are transmitted. If Δ is greater than or equal to the threshold, then *c*<sup>1</sup> is considered to be correct; the output of the comparator causes the ACK/NACK generator to convey an ACK to the transmitter and, in turn, the ACK/NACK generator causes the switch to close so that *c*<sup>1</sup> is switched to the output *u***ˆ**. The ACK causes the entire IR procedure to begin again with a new vector *u*. Δ serves as an indication of whether the codeword *c*<sup>1</sup> is correct in the following way. If *c*<sup>1</sup> is correct, then *d*<sub>*E*</sub><sup>2</sup>( *y*, *c*1) is a summation of squared noise samples only, because the signal terms cancel out. The codeword *c*<sup>2</sup> differs from *c*<sup>1</sup> in a number of symbol positions at least equal to the minimum Hamming distance of the current code, *dmin*. With the minimum squared Euclidean distance between symbols defined as *d*<sub>*S*</sub><sup>2</sup>, Δ will be greater than or equal to *dmin* × *d*<sub>*S*</sub><sup>2</sup> plus a noise term dependent on the signal-to-noise ratio. If *c*<sup>1</sup> is not correct, *d*<sub>*E*</sub><sup>2</sup>( *y*, *c*1) and *d*<sub>*E*</sub><sup>2</sup>( *y*, *c*2) will be similar and Δ will be small.
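The reliability test of Eqs. (17.11)–(17.13) can be sketched as below. The text names the ingredients of the threshold (the minimum Hamming distance of the first code, the noise power and the constant κ) but not its exact form, so the simple product used here is an assumption for illustration only.

```python
# Sketch of the Delta reliability test (Eq. 17.13).
def delta_reliability(y, c1, c2):
    """Difference of squared Euclidean distances between the received
    vector y and the two most likely codewords c1 and c2 (Eq. 17.13)."""
    d1 = sum((yj - cj) ** 2 for yj, cj in zip(y, c1))  # Eq. (17.11)
    d2 = sum((yj - cj) ** 2 for yj, cj in zip(y, c2))  # Eq. (17.12)
    return d2 - d1

def accept(y, c1, c2, dmin, noise_power, kappa):
    """ACK if Delta exceeds a threshold built from dmin of the first
    code, the noise power and kappa (the product form is an assumption)."""
    threshold = kappa * dmin * noise_power
    return delta_reliability(y, c1, c2) >= threshold
```

With antipodal symbols, a correct `c1` makes the first distance pure noise while `c2` must differ in at least *dmin* positions, so Δ is large; an incorrect `c1` leaves the two distances similar and Δ small.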

If more parity symbols are transmitted because Δ is less than the threshold, the *dmin* of the code increases with each increase of codeword length and provided *c*<sup>1</sup> is correct, Δ will increase accordingly.

The Reliability measure shown in Fig. 17.3 uses the squared Euclidean distance, but equivalent soft decision metrics, including cross-correlation and log-likelihood ratios, may be used to the same effect.

In the system shown in Fig. 17.4, a CRC is transmitted in the first transmitted codeword. The *m* information symbols, shown as vector *u* in Fig. 17.4, are encoded with the CRC encoder to form a total of *k* symbols, shown as vector *x*. The *k* symbols are encoded by the FEC encoder into *nM* symbols, denoted as *c<sup>M</sup>* , which are stored in the transmission controller. In the first transmission, *n*<sup>1</sup> symbols are transmitted. At the end of the *i*th stage, a codeword of total length *ni* symbols has been transmitted. This corresponds to a codeword of length *nM* symbols punctured in the last *nM* − *ni* symbols. In Fig. 17.4, the codeword of length *ni* is represented as a vector *v*, which is passed through the channel to produce *y* and buffered in the Received buffer. The buffered vector *y* is decoded by the FEC decoder, which produces *L* codewords with decreasing reliability as measured by the squared Euclidean distance between each codeword and the received symbols, or as measured by an equivalent soft decision metric such as cross-

**Fig. 17.4** The incremental-redundancy ARQ scheme with adjustable reliability using a CRC

correlation between each codeword and the received symbols. The *L* codewords are input to the CRC check, which determines the most reliable codeword, *c<sup>j</sup>* , that satisfies the CRC and the next most reliable codeword, *c<sup>l</sup>* , that also satisfies the CRC. The Reliability estimator shown in Fig. 17.4 determines the difference, Δ, of the squared Euclidean distances between codewords *c<sup>j</sup>* and *c<sup>l</sup>* and the corresponding received symbols.

Δ is given by

$$
\Delta = d\_E^2(\mathbf{y}, \mathbf{c}\_l) - d\_E^2(\mathbf{y}, \mathbf{c}\_j) \tag{17.14}
$$

Δ is compared to a threshold which is calculated from the minimum Hamming distance of the first code in the sequence of codes, the absolute noise power, and a multiplicative constant termed κ. As shown in Fig. 17.4, Δ is compared to the threshold by the comparator. If Δ is less than the threshold, *c<sup>j</sup>* is considered to be insufficiently reliable, and the output of the comparator causes the ACK/NACK generator to convey a NACK to the transmitter so that more parity symbols are transmitted. If Δ is greater than or equal to the threshold, then *c<sup>j</sup>* is considered to be correct; the output of the comparator causes the ACK/NACK generator to convey an ACK to the transmitter and, in turn, the ACK/NACK generator causes the switch to close so that *c<sup>j</sup>* is switched to the output *u***ˆ**. The ACK causes the entire IR procedure to begin again with a new vector *u*.

The Reliability measure shown in Fig. 17.4 uses the squared Euclidean distance, but equivalent soft decision metrics, including cross-correlation and log-likelihood ratios, may be used to the same effect.

#### **17.2.1.1 Code Generation Using Nested Block Codes**

If *C* is a cyclic code of length *n*, then there exist a generator polynomial *g*(*x*) ∈ F<sub>2</sub>[*x*] and a parity-check polynomial *h*(*x*) ∈ F<sub>2</sub>[*x*] such that *g*(*x*)*h*(*x*) = *x*<sup>*n*</sup> − 1. Two cyclic codes, *C*<sup>1</sup> with generator polynomial *g*1(*x*) and *C*<sup>2</sup> with generator polynomial *g*2(*x*), are said to be chained or nested if *g*1(*x*)|*g*2(*x*), and we denote this by *C*<sup>1</sup> ⊃ *C*2. With reference to this definition, it is clear that narrow-sense BCH codes of the same length form a chain of cyclic codes. Given a chain of two codes, the code with the larger dimension can be lengthened, using a code construction method known as Construction X, first described by Sloane et al. [5], to produce a code with increased length and minimum distance.
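The nesting condition *g*1(*x*)|*g*2(*x*) is straightforward to test directly. Below is a minimal sketch of polynomial division over GF(2), with each polynomial held as an integer bit mask (bit *i* is the coefficient of *x*<sup>*i*</sup>); the example codes in the usage note are illustrative, not taken from the chapter.

```python
# Minimal GF(2) polynomial arithmetic for testing the nesting
# condition g1(x) | g2(x).
def gf2_mod(a, b):
    """Remainder of a(x) divided by b(x) over GF(2)."""
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        # cancel the leading term of a(x) with a shifted copy of b(x)
        a ^= b << (a.bit_length() - 1 - db)
    return a

def is_nested(g1, g2):
    """C1 (generator g1) contains C2 (generator g2) iff g1 divides g2."""
    return gf2_mod(g2, g1) == 0
```

For example, *g*1 = *x*³ + *x* + 1 (`0b1011`, the [7, 4, 3] Hamming code) and *g*2 = (*x* + 1)*g*1 = *x*⁴ + *x*³ + *x*² + 1 (`0b11101`) satisfy `is_nested(0b1011, 0b11101)`, while the reverse direction fails.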

A generalised form of Construction X involves more than two codes. Let *B<sup>i</sup>* be an [*n*1, *ki*, *di*] code. Given a chain of *M* codes, *B*<sup>1</sup> ⊃ *B*<sup>2</sup> ⊃ ··· ⊃ *B<sup>M</sup>* , and a set of auxiliary codes *A<sup>i</sup>* = [*n*′*<sup>i</sup>*, *k*′*<sup>i</sup>*, *d*′*<sup>i</sup>*], for 1 ≤ *i* ≤ *M* − 1, where *k*′*<sup>i</sup>* = *k*<sup>1</sup> − *ki* , a code

$$\mathcal{C}\_X = \left[\, n\_1 + \sum\_{i=1}^{M-1} n'\_i,\; k\_1,\; d \,\right]$$

can be constructed, where

$$d = \min\left\{ d\_M,\; d\_{M-1} + d'\_{M-1},\; d\_{M-2} + d'\_{M-2} + d'\_{M-1},\; \dots,\; d\_1 + \sum\_{i=1}^{M-1} d'\_i \right\}.$$

Denote by *z* the vector of length *n*<sup>1</sup> formed by the first *n*<sup>1</sup> coordinates of a codeword of *C<sup>X</sup>* . A codeword of *C<sup>X</sup>* is a juxtaposition of codewords of *B<sup>i</sup>* and *A<sup>i</sup>* , where

$$
\begin{cases}
(\,\mathbf{b}\_M \mid \mathbf{0} \mid \mathbf{0} \mid \dots \mid \mathbf{0} \mid \mathbf{0}\,) & \text{if } \mathbf{z} \in \mathcal{B}\_M,\\
(\,\mathbf{b}\_{M-1} \mid \mathbf{0} \mid \mathbf{0} \mid \dots \mid \mathbf{0} \mid \mathbf{a}\_{M-1}\,) & \text{if } \mathbf{z} \in \mathcal{B}\_{M-1},\\
(\,\mathbf{b}\_{M-2} \mid \mathbf{0} \mid \mathbf{0} \mid \dots \mid \mathbf{a}\_{M-2} \mid \mathbf{a}\_{M-1}\,) & \text{if } \mathbf{z} \in \mathcal{B}\_{M-2},\\
\qquad\vdots & \qquad\vdots\\
(\,\mathbf{b}\_2 \mid \mathbf{0} \mid \mathbf{a}\_2 \mid \dots \mid \mathbf{a}\_{M-2} \mid \mathbf{a}\_{M-1}\,) & \text{if } \mathbf{z} \in \mathcal{B}\_2,\\
(\,\mathbf{b}\_1 \mid \mathbf{a}\_1 \mid \mathbf{a}\_2 \mid \dots \mid \mathbf{a}\_{M-2} \mid \mathbf{a}\_{M-1}\,) & \text{if } \mathbf{z} \in \mathcal{B}\_1,
\end{cases}
$$

where *b<sup>i</sup>* ∈ *B<sup>i</sup>* and *a<sup>i</sup>* ∈ *A<sup>i</sup>* .

#### **17.2.1.2 Example of Code Generation Using Nested Block Codes**

There exists a chain of extended BCH codes of length 128 bits,

$$[128, 113, 6] \supset [128, 92, 12] \supset [128, 78, 16] \supset [128, 71, 20].$$

Applying Construction X to [128, 113, 6] ⊃ [128, 92, 12] with a [32, 21, 6] extended BCH code as the auxiliary code, a [160, 113, 12] code is obtained, giving

$$[160, 113, 12] \supset [160, 92, 12] \supset [160, 78, 16] \supset [160, 71, 20].$$

Next, applying Construction X to [160, 113, 12] ⊃ [160, 78, 16] with a [42, 35, 4] shortened extended Hamming code as the auxiliary code gives

$$[202, 113, 16] \supset [202, 92, 16] \supset [202, 78, 16] \supset [202, 71, 20].$$

Finally, applying Construction X to [202, 113, 16] ⊃ [202, 71, 20] with the [49, 42, 4] shortened extended Hamming code as the auxiliary code gives

$$[251, 113, 20] \supset [251, 92, 20] \supset [251, 78, 20] \supset [251, 71, 20].$$

The resulting sequence of codes which are used in this example are [128, 113, 6], [160, 113, 12], [202, 113, 16] and [251, 113, 20].
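The minimum distances claimed in each lengthening step above can be checked against the Construction X distance rule. A minimal sketch, with the chain and auxiliary distances passed as plain lists:

```python
# Check of the Construction X minimum-distance rule:
# d = min{ d_M, d_{M-1}+d'_{M-1}, ..., d_1 + sum of all d'_i }.
def construction_x_distance(chain_d, aux_d):
    """chain_d = [d_1, ..., d_M] for the chain B_1 > B_2 > ... > B_M;
    aux_d = [d'_1, ..., d'_{M-1}] for the auxiliary codes A_i."""
    M = len(chain_d)
    candidates = [chain_d[M - 1]]                       # d_M
    for i in range(M - 1, 0, -1):
        candidates.append(chain_d[i - 1] + sum(aux_d[i - 1:]))
    return min(candidates)
```

For the first application, min{12, 6 + 6} = 12, matching the [160, 113, 12] code; the second gives min{16, 12 + 4} = 16 and the third min{20, 16 + 4} = 20.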

The generator matrix of the last code, the [251, 113, 20] code is given by

$$
\mathbf{G} = \left(\begin{array}{cccc|c||ccc}
\mathbf{I}\_{71} & \mathbf{0} & \mathbf{0} & \mathbf{0} & -\mathbf{R}\_4 & \mathbf{0} & \mathbf{0} & \mathbf{0}\\
\mathbf{0} & \mathbf{I}\_{7} & \mathbf{0} & \mathbf{0} & -\mathbf{R}\_3 & \mathbf{0} & \mathbf{0} & \mathbf{G}\_{A\_3}\\
\mathbf{0} & \mathbf{0} & \mathbf{I}\_{14} & \mathbf{0} & -\mathbf{R}\_2 & \mathbf{0} & \mathbf{G}\_{A\_2} & \mathbf{0}\\
\mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{I}\_{21} & -\mathbf{R}\_1 & \mathbf{G}\_{A\_1} & \mathbf{0} & \mathbf{0}
\end{array}\right) \tag{17.15}
$$

On the left hand side of the double bar, the generator matrix of the code *B*<sup>1</sup> is decomposed along the chain *B*<sup>1</sup> ⊃ *B*<sup>2</sup> ⊃ *B*<sup>3</sup> ⊃ *B*4. The matrices **G***A<sup>i</sup>* , for 1 ≤ *i* ≤ 3 are the generator matrices of the auxiliary codes *A<sup>i</sup>* .

This generator matrix is used to generate each entire codeword of length *nM* = 251 bits, but these bits are not transmitted unless requested. The first 128 bits of each entire codeword form the codeword of the [128, 113, 6] code and are transmitted first, bit 0 through to bit 127. The next transmission (if requested by the IR system) consists of 32 parity bits, bit 128 through to bit 159 of the entire codeword. These 32 parity bits plus the original 128 bits form a codeword of the [160, 113, 12] code. The next transmission (if requested by the IR system) consists of 42 parity bits, bit 160 through to bit 201 of the entire codeword. These 42 parity bits plus the previously transmitted 160 bits form a codeword of the [202, 113, 16] code. The last transmission (if requested by the IR system) consists of 49 parity bits, the last 49 bits, bit 202 through to bit 250, of the entire codeword. These 49 parity bits plus the previously transmitted 202 bits form a codeword of the [251, 113, 20] code. The sequence of increasing length codewords produced by the successive transmissions (if requested by the IR system) has a minimum Hamming distance which starts at 6, increases to 12, then to 16 and finally to 20. In turn, this produces an increasing reliability given by Eq. (17.13) or (17.14), depending on the type of system.
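The transmission schedule just described can be sketched as slicing cumulative prefixes out of the stored 251-bit codeword; the tuple pairs below simply restate the slice lengths and minimum distances given in the text.

```python
# Sketch of the IR transmission schedule for the nested code sequence:
# the full 251-bit codeword is generated once and released in slices.
SCHEDULE = [(128, 6), (32, 12), (42, 16), (49, 20)]  # (extra bits, dmin reached)

def ir_slices(codeword):
    """Yield (net codeword so far, dmin of the net code) per stage."""
    sent = 0
    for extra, dmin in SCHEDULE:
        sent += extra
        yield codeword[:sent], dmin

full = [0] * 251                      # placeholder for an encoded codeword
lengths = [(len(v), d) for v, d in ir_slices(full)]
# lengths == [(128, 6), (160, 12), (202, 16), (251, 20)]
```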

A completely different method of generating nested codes is to use the external parity-check augmentation method, first suggested by Goppa, in which independent columns are added incrementally to the parity-check matrix. The method is described in detail in Chap. 6 and can be applied to any Goppa or BCH code.

In order to be used in HARQ systems, an FEC decoder is needed that will decode these nested block codes. One such universal decoder is the modified Dorsch decoder, described in Chap. 15, and results using this decoder are presented below.

#### **17.2.1.3 List Decoder for Turbo and LDPC Codes**

If LDPC or turbo codes are to be used, the HARQ system needs a decoder that provides several codewords at its output, in order that the difference between the squared Euclidean distances (or an equivalent soft decision metric) of the most likely transmitted codeword and the next most likely transmitted codeword may be determined and compared to the threshold. For turbo codes, the conventional decoder is not a list decoder, but Narayanan and Stuber [3] show how a list decoder may be provided. Similarly, Kristensen [1] shows how a list decoder may be provided for LDPC codes.

#### **17.2.1.4 Performance Results Using the Nested Codes**

Computer simulations using the nested codes constructed above have been carried out for all three HARQ systems: the traditional HARQ system using hard decision checks of the CRC, and the two new systems featuring the soft decision check between the decoded codeword and the received vector, with or without a CRC. All of the simulations of the three systems have been carried out using a modified Dorsch decoder as described in Chap. 15. The modified Dorsch decoder can easily be configured as a list decoder with hard and soft decision outputs.

For each one of the nested codes, the decoder exhibits almost optimum, maximum likelihood performance by virtue of its delta correlation algorithm: for each new received vector input, a total of 10<sup>6</sup> codewords closest to the received vector are evaluated. Since the decoder knows which of the nested codes it is decoding, the settings of the decoder can be optimised for each code.

For the CRC cases, an 8-bit CRC polynomial (1 + *x*)(1 + *x* <sup>2</sup> + *x* <sup>5</sup> + *x* <sup>6</sup> + *x* <sup>7</sup>) was used, the 8 CRC bits being included in each codeword. It should be noted that, in calculating the throughput, these CRC bits are not counted as information bits; in the CRC cases, there are 105 information bits per transmitted codeword. In the computer simulations, an ACK is transmitted if Δ is greater than the threshold or there have been *M* IR transmissions; otherwise a NACK is transmitted.
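A bitwise sketch of this 8-bit CRC is given below. The polynomial (1 + *x*)(1 + *x*² + *x*⁵ + *x*⁶ + *x*⁷) expands over GF(2) to 1 + *x* + *x*² + *x*³ + *x*⁵ + *x*⁸, i.e. bit pattern `0x12F`; the shift-register style and zero initial value are assumptions, as the chapter does not specify the register conventions.

```python
# Bitwise CRC using the chapter's 8-bit polynomial
# (1+x)(1+x^2+x^5+x^6+x^7) = 1 + x + x^2 + x^3 + x^5 + x^8 = 0x12F.
CRC_POLY = 0x12F

def crc8(bits):
    """Remainder of bits(x) * x^8 divided by the CRC polynomial."""
    reg = 0
    for b in bits:
        reg = (reg << 1) | b
        if reg & 0x100:          # degree-8 term set: reduce modulo g(x)
            reg ^= CRC_POLY
    for _ in range(8):           # flush: multiply the message by x^8
        reg <<= 1
        if reg & 0x100:
            reg ^= CRC_POLY
    return reg & 0xFF
```

Appending the 8 CRC bits to a message makes the whole word divisible by the polynomial, so re-running `crc8` over message plus CRC returns zero, which is the receiver-side check.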

The traditional HARQ system using a CRC is compared to the new system not using a CRC in Figs. 17.5 and 17.6. The comparative frame error rate (FER) performance is shown in Fig. 17.5 and the throughput is shown in Fig. 17.6 as a function

**Fig. 17.5** The error rate performance in comparison to the classical HARQ scheme using a CRC

of the average *E<sub>b</sub>/N<sub>0</sub>* ratio. The traditional CRC approach shows good throughput but exhibits an early error-floor in the FER, which is caused by undetected error events. The FER performance of the new system shows the benefit of the increased reliability of error detection compared to the traditional CRC approach. Two threshold settings are provided using the multiplicative constant κ, and the effects of these are shown in Figs. 17.5 and 17.6. It is apparent from the graphs that the threshold setting may be used to trade off throughput against reduced FER. The improvements in both throughput and FER provided by the new HARQ systems compared to the conventional HARQ system, featuring a hard decision CRC check, are evident from Figs. 17.5 and 17.6.

The comparative FER performance and throughput with a CRC compared to not using a CRC are shown in Figs. 17.7 and 17.8 for the new system, where the threshold is fixed by κ = 1. The new system using a CRC shows an improvement in FER, Fig. 17.7, over the entire range of average *E<sub>b</sub>/N<sub>0</sub>*, and an improvement in throughput, Fig. 17.8, also over the entire range of average *E<sub>b</sub>/N<sub>0</sub>*, compared to the traditional HARQ approach using a CRC.

**Fig. 17.6** The throughput performance without using a CRC in comparison to the classical HARQ scheme using a CRC

**Fig. 17.7** The error rate performance using a CRC in comparison to the classical HARQ scheme using a CRC

**Fig. 17.8** The throughput performance with a CRC in comparison to the classical HARQ scheme using a CRC

#### **17.3 Summary**

This chapter has discussed the design of codes and systems for combined error detection and correction, primarily aimed at applications featuring retransmission of data packets which have not been decoded correctly. Several such hybrid automatic repeat request (HARQ) systems have been described, including a novel system variation which uses a retransmission metric based on a soft decision: the Euclidean distance between the decoded codeword and the received vector. It has been shown that a cyclic redundancy check (CRC) is not essential for this system and need not be transmitted.

It has also been shown how to construct the generator matrix of a nested set of block codes of length 251 bits by applying Construction X three times in succession, starting with an extended BCH (128, 113, 6) code. The resulting nested codes have been used as the basis for an incremental-redundancy system whereby the first 128 bits transmitted form a codeword of the BCH code, followed by the transmission of a further 32 bits, if requested, producing a codeword of a (160, 113, 12) code. Further requests for additional transmitted bits finally result in a codeword of a (251, 113, 20) code, each time increasing the chance of correct decoding by increasing the minimum Hamming distance of the net received codeword. Performance graphs have been presented showing the comparative error rate performances and throughputs of the new HARQ systems compared to the standard HARQ system. The advantages of lower error floors and increased throughputs are evident from the presented graphs.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 18 Password Correction and Confidential Information Access System**

#### **18.1 Introduction and Background**

Following the trend of an increasing need for security and protection of confidential information, personal access codes and passwords are increasing in length, with the result that they are becoming more difficult to remember correctly. The system described in this chapter provides a solution to this problem by correcting small errors in an entered password without compromising the security of the access system. Moreover, additional levels of security are provided by the system by associating passwords with codewords of an error-correcting code and using a dynamic, user-specific mapping of Galois field symbols. This defeats password-attacking systems based on Rainbow tables, because each user transmits what appears to be a random byte stream as a password. A description of this system was first published by the authors as a UK patent application in 2007 [1].

The system is a method for the encoding and decoding of passwords and of the confidential information which is accessed by use of these passwords. Passwords may be composed of numbers and alphanumeric characters; easily remembered names, phrases or notable words are the most convenient from the point of view of users of the system.

Passwords are associated with the codewords of an error-correcting code, and consequently any small number of errors in an entered password may be automatically corrected. Several additional parity symbols may be calculated by the system to extend the password length prior to hashing, so as to overwhelm any attacks based on Rainbow tables. Dynamic mapping of code symbols is used to ensure that a password, when first registered by the user, is a codeword of the error-correcting code. In this process, sometimes a password cannot be a codeword, due to symbol contradictions, and an alternative word or phrase, which is a codeword, is offered to the user by the system. Alternatively, the user may elect to register a different password.

The system can provide feedback to users of the number of errors corrected in each re-entered password. Valid passwords are associated with a subset of the totality of all the codewords of the error-correcting code, and an entered password may be confirmed to the user as a valid password, or not.

Confidential information, for example Personal Identification Numbers (PINs), bank account numbers, safe combinations, or more general confidential messages, is encrypted at source and stored as a sequence of encrypted messages. Each encrypted message is uniquely associated with a cryptographic hash of a valid codeword. Retrieval of the confidential information is achieved by the user entering a password which is equal to the corresponding valid password or differs from it in a small number of character positions. Any small number of errors is corrected automatically, and feedback is provided to the user that a valid password has been decoded. The valid password is mapped to a single codeword from the very large number of codewords that comprise the error-correcting code.

On receiving a valid hash, the cloud sends the stored encrypted message that corresponds to the valid codeword. The encryption key may be derived from the reconstituted password in conjunction with other user-entered credentials, such as a fingerprint. The retrieved encrypted message is decrypted and the confidential information displayed to the user.

Security is provided by the system at a number of different levels. Codewords of the error-correcting code are composed of a sequence of symbols, with each symbol taken from a set of Galois field (GF) elements. Any size of Galois field may be used, provided the number of GF elements, *q*, is greater than or equal to the size of the alphabet used to construct passwords. The mapping of alphabet characters to GF elements may be defined uniquely by each user, and consequently there are at least *q*! possible mappings.
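A user-specific mapping can be sketched as a seeded permutation of the 256 symbol values. This is an illustration only: `random.Random` is used here for brevity, whereas the system described later in the chapter calls for a cryptographic random number generator seeded by the user.

```python
import random

# Sketch of a user-specific character -> GF(256) mapping as a
# seeded permutation of the 256 symbol values (one of 256! mappings).
def user_mapping(seed):
    """Return a permutation table: index = character code, value = GF symbol."""
    symbols = list(range(256))
    random.Random(seed).shuffle(symbols)
    return symbols

table = user_mapping("user-supplied seed material")   # hypothetical seed
mapped = [table[b] for b in b"silver"]                # password -> GF(256) symbols
```

Because the table is a permutation, the mapping is invertible on the user's device, and the same seed always regenerates the same table.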

The number of possible codewords of the error-correcting code is extremely large; typically there can be 10<sup>500</sup> possible codewords. The number of valid codewords in the subset of codewords is typically less than 10<sup>2</sup>, and so the brute force chance of a randomly selected codeword being a valid codeword is less than 10<sup>2</sup>/10<sup>500</sup> = 10<sup>−498</sup>. Even if an attacker or an eavesdropper enters a valid codeword, the information that is obtained is encrypted, and the confidential information cannot be retrieved without the encryption key, which requires the user's credentials.

One possible application of the system is as an information retrieval app on a smartphone with encrypted information stored in the cloud. For each registered user, a cloud-based message server stores a list of cryptographic hashes of valid codewords of the error-correcting code and an associated list of encrypted messages or files. The mapping of password characters to codeword GF symbols is carried out within the user's smartphone and cannot easily be accessed by an eavesdropper or an attacker unless the smartphone is stolen along with the user's login credentials. Additionally, the decryption of received encrypted messages is also carried out within the user's smartphone. To access a long, hard to remember PIN or a long sequence of cryptic characters, the user can enter the password, which is mapped to a GF symbol stream, which is automatically corrected by the smartphone before cryptographic hashing. The hash is encrypted, using a random session key exchanged using public key cryptography, before being transmitted by the smartphone to the cloud. This is to prevent replay attacks. If the codeword hash is correct, the cloud transmits the corresponding encrypted message or file, together with an acknowledgement. The user's smartphone receives the cipher text and decrypts it into the requested information.

#### **18.2 Details of the Password System**

A block diagram of the system showing how user defined passwords are mapped to sequences of GF symbols, encoded into codewords of an error-correcting code and associated with encrypted confidential information is shown in Fig. 18.1.

We consider, as an example, a system using passwords consisting of sequences of up to 256 characters, with characters taken from the ANSI (American National Standards Institute) single byte character set, and an error-correcting code which is a Reed–Solomon (RS) code [2], described in Chaps. 7 and 11. RS codes are MDS codes constructed from GF(*q*) field elements, where *q* is a prime or a power of a prime. In this example *q* = 2<sup>8</sup> and RS codewords are constructed as sequences of GF(256) symbols. Codewords can be designed to be any length up to *q* + 1 symbols if the doubly extended version of the RS code is utilised.

In general, any character set may be used in the system and any RS code may be used, provided the sequence of characters is no longer than the length of the error-correcting code and each symbol of the error-correcting code is from an alphabet of size equal to or greater than that of the character set used to define passwords. For maximum security, the mapping is chosen by a cryptographic random number generator, with a seed provided by the user, so that there is a high probability that the resulting mapping table is unique to each user of the information retrieval system.

It is convenient to use a binary base field, and the Galois field [3] GF(256) that is used is an extension field consisting of 8 binary GF(2) field elements, generated by the residues of α*<sup>n</sup>*, *n* = 0 to 254, modulo 1 + *x*<sup>2</sup> + *x*<sup>3</sup> + *x*<sup>4</sup> + *x*<sup>8</sup>, where 1 + *x*<sup>2</sup> + *x*<sup>3</sup> + *x*<sup>4</sup> + *x*<sup>8</sup> is an example of a primitive polynomial, plus the zero symbol *GF*(0).
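As a concrete illustration, the exp/log tables for this representation of GF(256) can be generated from the primitive polynomial above (bit mask 0x11D). This is a minimal sketch; the function and table names are illustrative, not from the text:

```python
# Hypothetical sketch: build exp/log tables for GF(256) using the primitive
# polynomial 1 + x^2 + x^3 + x^4 + x^8 (bit mask 0x11D) named in the chapter.
def gf256_tables():
    exp = [0] * 255          # exp[n] holds the byte representation of alpha^n
    log = [0] * 256          # log[v] = n such that alpha^n = v (for v != 0)
    v = 1
    for n in range(255):
        exp[n] = v
        log[v] = n
        v <<= 1              # multiply by alpha (i.e. by x)
        if v & 0x100:        # reduce modulo the primitive polynomial
            v ^= 0x11D
    return exp, log

exp, log = gf256_tables()

def gf_mul(a, b):
    """Multiply two GF(256) elements; 0 represents the zero symbol GF(0)."""
    if a == 0 or b == 0:
        return 0
    return exp[(log[a] + log[b]) % 255]
```

Because the polynomial is primitive, the 255 powers of α are all distinct nonzero bytes, and multiplication reduces to adding exponents modulo 255.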

As an example, the registered password "silver" is considered, whose corresponding sequence of ANSI numbers is

115 105 108 118 101 114

As shown in Fig. 18.1, a mapping table is used to map these numbers to GF(256) symbols. In this example, the error-correcting code is the (256, 254, 3) extended RS code, which is capable of correcting either two erased symbols or one erroneous symbol; the code has 254 information symbols and 2 parity-check symbols. The first two symbols are chosen as the parity-check symbols and denoted as *p*<sub>1</sub> and *p*<sub>2</sub>, respectively. Putting the parity symbols first is convenient because short codewords can easily be accommodated by assuming that any unused information symbols have value zero and therefore do not affect the parity symbols. A general codeword of this code, as an extended GF(256) RS code, is

$$\begin{array}{cccccccc}p\_1 & p\_2 & x\_1 & x\_2 & x\_3 & x\_4 & \dots & x\_{254} \end{array}$$

The general parity-check matrix of an extended RS code with *n* − *k* parity-check symbols is

$$\mathbf{H} = \begin{bmatrix} 1 & 1 & 1 & \dots & 1 & 1 \\ 1 & \alpha^1 & \alpha^2 & \dots & \alpha^{n-1} & 0 \\ 1 & \alpha^2 & \alpha^4 & \dots & \alpha^{2(n-1)} & 0 \\ 1 & \alpha^3 & \alpha^6 & \dots & \alpha^{3(n-1)} & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 1 & \alpha^{n-k-1} & \alpha^{2(n-k-1)} & \dots & \alpha^{(n-k-1)(n-1)} & 0 \\ \end{bmatrix}$$

To provide more flexibility in symbol mapping, as described below, the generalised extended RS code may be used with parity-check matrix **H**η.

$$\mathbf{H}\_{\eta} = \begin{bmatrix} \eta\_{0} & \eta\_{1} & \eta\_{2} & \dots & \eta\_{n-1} & \eta\_{n} \\ \eta\_{0} & \eta\_{1}\alpha^{1} & \eta\_{2}\alpha^{2} & \dots & \eta\_{n-1}\alpha^{n-1} & 0 \\ \eta\_{0} & \eta\_{1}\alpha^{2} & \eta\_{2}\alpha^{4} & \dots & \eta\_{n-1}\alpha^{2(n-1)} & 0 \\ \eta\_{0} & \eta\_{1}\alpha^{3} & \eta\_{2}\alpha^{6} & \dots & \eta\_{n-1}\alpha^{3(n-1)} & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ \eta\_{0} & \eta\_{1}\alpha^{n-k-1} & \eta\_{2}\alpha^{2(n-k-1)} & \dots & \eta\_{n-1}\alpha^{(n-k-1)(n-1)} & 0 \\ \end{bmatrix}$$

The constants η<sub>0</sub>, η<sub>1</sub>, η<sub>2</sub>, ... η<sub>*n*</sub> may be arbitrarily chosen provided they are nonzero symbols of *GF*(*q*).

With two parity-check symbols, only the first two rows of **H** are needed, and we may conveniently place the last column first to obtain the reduced echelon parity-check matrix **H2**

$$\mathbf{H}\_2 = \left[ \begin{array}{cccc} 1 & 1 & 1 & 1 & \dots & 1 \\ 0 & 1 & \alpha^1 & \alpha^2 & \dots & \alpha^{n-1} \\ \end{array} \right]$$

Any pseudo-random, one-to-one mapping of ANSI numbers to GF(256) symbols may be used. It is convenient always to map the null character, ANSI number 32, to the field element *GF*(0); otherwise each password would consist of 256 characters, and 256 password characters would have to be entered for each password. With the null character mapping, a shortened RS codeword is equal to the full length codeword, since any of the *GF*(0) symbols may be deleted without affecting the parity-check symbols. Consequently, short passwords may be accommodated very easily.

It is possible to choose a fixed one-to-one mapping of ANSI numbers to GF(256) symbols, the same for all users, but in this case many passwords would fail as non-valid on first registering, unless arbitrarily assigned characters are allowed in the parity symbol positions. However, this is an unnecessary constraint on the system, since the codewords and passwords of different users are processed independently of each other. Moreover, security is enhanced if each user uses a different mapping.

In the following example, dynamic mapping is used. The mapping is chosen such that each information symbol of the RS codeword corresponding to "silver" equals the primitive root α raised to the power of the ANSI number of the password character in the same respective position, except for the null character, which is mapped to *GF*(0). As the codeword has parity symbols in its first two positions, and these symbols are a function of the other symbols in the codeword (the information symbols), the mapping of the first two characters needs to be different. Accordingly, the codeword is

$$p\_1 \quad p\_2 \quad \alpha^{108} \quad \alpha^{118} \quad \alpha^{101} \quad \alpha^{114} \quad 0 \quad \dots \quad 0$$

From the parity-check matrix **H2**, the parity-check symbols are given by

$$p\_2 = \sum\_{i=1}^{254} \alpha^i x\_i \tag{18.1}$$

$$p\_1 = \sum\_{i=1}^{254} x\_i + p\_2 \tag{18.2}$$

After substituting into Eq. (18.1) and then Eq. (18.2), it is found that *p*<sub>1</sub> = α<sup>220</sup> and *p*<sub>2</sub> = α<sup>57</sup>, and the complete RS codeword corresponding to the defined password "silver" is

$$\alpha^{220} \quad \alpha^{57} \quad \alpha^{108} \quad \alpha^{118} \quad \alpha^{101} \quad \alpha^{114} \quad 0 \quad \dots \quad 0$$
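The encoding step above can be sketched directly from Eqs. (18.1) and (18.2). Note that the specific exponents in the text (α<sup>220</sup>, α<sup>57</sup>) depend on the book's mapping table and field representation; the sketch below assumes the primitive polynomial 0x11D and so only checks self-consistency, namely that the syndromes of Eqs. (18.3) and (18.4) vanish for the computed codeword:

```python
# Sketch of the (256, 254, 3) extended-RS parity computation of Eqs. (18.1)
# and (18.2), followed by a check that the syndromes of Eqs. (18.3) and
# (18.4) vanish.  The field representation (primitive polynomial 0x11D) is an
# assumption, so the parity exponents obtained need not reproduce the
# alpha^220, alpha^57 of the text.
exp, log = [0] * 255, [0] * 256
v = 1
for n in range(255):
    exp[n], log[v] = v, n
    v = (v << 1) ^ (0x11D if v & 0x80 else 0)

def mul(a, b):
    return 0 if 0 in (a, b) else exp[(log[a] + log[b]) % 255]

# Information symbols x_1..x_254 for "silver" under the example mapping:
x = [0] * 255                      # index i holds x_i; x[0] is unused
for i, e in enumerate([108, 118, 101, 114], start=1):
    x[i] = exp[e]

p2 = 0
for i in range(1, 255):
    p2 ^= mul(exp[i], x[i])        # Eq. (18.1): p2 = sum of alpha^i * x_i
p1 = p2
for i in range(1, 255):
    p1 ^= x[i]                     # Eq. (18.2): p1 = sum of x_i, plus p2

s1, s2 = p1 ^ p2, p2               # Eqs. (18.3) and (18.4)
for i in range(1, 255):
    s1 ^= x[i]
    s2 ^= mul(exp[i], x[i])
print(s1, s2)                      # prints: 0 0
```

Since addition in GF(256) is bitwise XOR, "adding" and "subtracting" symbols are the same operation throughout.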

The RS codeword encoder, shown in Fig. 18.1, takes the mapping of defined password characters to GF(256) symbols as input and outputs the mapping of the parity symbols to the mapping table, as shown in Fig. 18.1. Accordingly, the mapping of the first two characters of the password is that ANSI number 115 is mapped to α<sup>220</sup> and ANSI number 220 is mapped to α<sup>115</sup>, and the mapping table is updated accordingly, as indicated in Fig. 18.1 by the two directional vectors. Of course, for these mappings to be valid, neither ANSI numbers 115 and 220 nor GF(256) symbols α<sup>220</sup> and α<sup>115</sup> may already be mapped; otherwise the mappings of previously defined passwords would be affected.

As new passwords are defined and new valid codewords calculated, it is relatively straightforward to amend the lists of assigned and unassigned mappings in the mapping table dynamically. This dynamic mapping assignment not only increases the range of possible passwords but also has a secondary advantage, which arises from the mapping of entered passwords and the subsequent error-correction decoding. Any entered password character not having an assigned ANSI number cannot be part of a valid password, and accordingly the corresponding GF(256) symbol is marked as an erased symbol. Since, on average, an error-correcting code can correct twice as many erased characters as erroneous characters, a distinct advantage arises.

The confidential information corresponding to "silver" is, for example:

"The safe combination is 29 53 77 22" and as shown in Fig. 18.1 confidential information input to the system is encrypted using an encryption key. The encryption key is usually chosen from a master encryption key, unique to each user. Once input, the confidential information is only retained in encrypted form. The encrypted confidential information associated with "silver" forms the encrypted text:

```
AjelMHjq+iw&ndˆfh)y!"16f@h:G#)P7=3Mq|2=0+YX?z/+6sGs+2|Zl
```
As shown in Fig. 18.2, in order to retrieve this confidential information, the user re-enters their password. However, this entered password is allowed to contain errors. For example, the password "solver" may be entered, with the corresponding ANSI number sequence:

115 111 108 118 101 114 ... 32 32 32

Following input of the password, as shown in Fig. 18.2, the mapping table is used to map the entered password into the GF(256) sequence

$$\alpha^{220} \quad \alpha^{88} \quad \alpha^{108} \quad \alpha^{118} \quad \alpha^{101} \quad \alpha^{114} \quad 0 \dots \quad 0$$

The decoder for the RS code, as shown in Fig. 18.2, decodes the sequence of GF(256) symbols, resulting from the mapping using the mapping table, into a codeword of the error-correcting code. In order to carry this out, two syndromes, *s*<sup>1</sup> and *s*2, are calculated from the two parity-check equations for the extended (256, 254, 3) RS code:

$$s\_1 = \sum\_{i=1}^{254} x\_i + p\_2 + p\_1 \tag{18.3}$$

$$s\_2 = \sum\_{i=1}^{254} \alpha^i x\_i + p\_2 \tag{18.4}$$

The two syndromes, in this case, are both found to be equal to α<sup>43</sup>, indicating that there is an error in the second information symbol position of the codeword and that this error is α<sup>43</sup>. If the same error had been in the third symbol position, say, the two syndromes would have been α<sup>43</sup> and α<sup>44</sup>.

Subtracting the error α<sup>43</sup> from the entered symbol (after mapping) α<sup>88</sup> produces the correct GF(256) symbol α<sup>57</sup>. The RS codeword is thus corrected to

$$
\alpha^{220} \quad \alpha^{57} \quad \alpha^{108} \quad \alpha^{118} \quad \alpha^{101} \quad \alpha^{114} \quad 0 \dots \quad 0
$$

Applying the inverse mapping to each GF(256) symbol produces the ANSI number sequence

115 105 108 118 101 114 32 32 32 32 ... 32

This corresponds to "silver", the corrected entered password.
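The correction step can be sketched using the **H2** parity-check matrix directly: for a single error *e* in information position *j*, the syndromes satisfy *s*<sub>1</sub> = *e* and *s*<sub>2</sub> = α<sup>*j*</sup>*e*, so the position is recovered from the exponent difference. The field representation (0x11D) is an assumption, so the particular syndrome values of the worked example are not reproduced; the sketch checks self-consistency instead:

```python
# Minimal single-error decoder for the (256, 254, 3) extended RS code, built
# on the H2 matrix of the text: s1 = e and s2 = alpha^j * e for an error e in
# information position j.  Field representation 0x11D is an assumption; this
# sketch does not handle an error in the p1 position.
exp, log = [0] * 255, [0] * 256
v = 1
for n in range(255):
    exp[n], log[v] = v, n
    v = (v << 1) ^ (0x11D if v & 0x80 else 0)

def mul(a, b):
    return 0 if 0 in (a, b) else exp[(log[a] + log[b]) % 255]

def encode(x):                       # x[1..254] are the information symbols
    p2 = 0
    for i in range(1, 255):
        p2 ^= mul(exp[i], x[i])
    p1 = p2
    for i in range(1, 255):
        p1 ^= x[i]
    return [p1, p2] + x[1:]          # codeword (p1, p2, x_1, ..., x_254)

def correct_single_error(cw):
    p1, p2, x = cw[0], cw[1], [0] + cw[2:]
    s1, s2 = p1 ^ p2, p2
    for i in range(1, 255):
        s1 ^= x[i]
        s2 ^= mul(exp[i], x[i])
    if s1 == 0 and s2 == 0:
        return cw                    # already a valid codeword
    j = (log[s2] - log[s1]) % 255    # j = 0 means p2; j >= 1 means x_j
    out = cw[:]
    out[1 + j] ^= s1                 # subtract (= XOR) the error value
    return out

x = [0] * 255
for i, e in enumerate([108, 118, 101, 114], start=1):
    x[i] = exp[e]
cw = encode(x)
rx = cw[:]
rx[3] ^= exp[43]                     # corrupt x_2, as in "solver" vs "silver"
assert correct_single_error(rx) == cw
```

The exponent difference log(*s*<sub>2</sub>) − log(*s*<sub>1</sub>) mod 255 isolates *j* because the error value cancels between the two syndromes.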

As shown in Fig. 18.2, the decoded codeword is compared to the list of valid codewords of the error-correcting code. The codewords of the error-correcting code are split into two groups: the valid codewords and the rest, the invalid codewords. The codeword

$$\alpha^{220} \quad \alpha^{57} \quad \alpha^{108} \quad \alpha^{118} \quad \alpha^{101} \quad \alpha^{114} \quad 0 \dots \quad 0$$

is verified as a valid codeword associated with the encrypted confidential information:

AjelMHjq+iw&ndˆfh)y!"16f@h:G#)P7=3Mq|2=0+YX?z/+6sGs+2|Zl-GW p<)g/,HDZ)H4D7F/j+gFAqYlFcXZPMY6\$3"/

As shown in Fig. 18.2, this is decrypted using the encryption key, and the confidential information is output: "The safe combination is 29 53 77 22"

In a further extension of the system, as shown in Fig. 18.3, the encoded RS codeword, denoted as **cx** which results from the mapped, defined password is convolved with a fixed RS codeword denoted as

$$\mathbf{y}\_{\mathbf{x}} = \alpha^{y\_0} + \alpha^{y\_1}x + \alpha^{y\_2}x^2 + \alpha^{y\_3}x^3 + \cdots + \alpha^{y\_{254}}x^{254}$$

Note that the standard polynomial-based RS codes of length *q* − 1 are used in this system variation. The fixed RS codeword is the result of encoding a random set of GF(256) information symbols. The reason for doing this is to ensure that the resulting codeword after convolution, **rx**

$$\mathbf{r}\_{\mathbf{x}} = \mathbf{c}\_{\mathbf{x}} \mathbf{y}\_{\mathbf{x}} \text{ modulo } 1 + x^{255} \tag{18.5}$$

does not have a long sequence of *GF*(0) symbols, which might compromise the security of the information retrieval system. Correspondingly, it is the codeword **rx** which is associated with the encrypted message. As shown in Fig. 18.4, retrieval of encrypted

information is carried out by entering a password. After the decoding of the sequence of GF(256) symbols resulting from the mapping of the entered password, using the mapping table, into a codeword of the error-correcting code, this codeword is convolved with the fixed codeword as shown in Fig. 18.4. The resulting codeword is compared to the list of valid codewords of the error-correcting code.

One feature, particularly useful for long, hard-to-remember passwords, is that a partially known password may be entered, deliberately using characters known not to be contained in the mapping table, in order for the system to fill in the missing parts of the password. Characters may be reserved for this purpose. As a simple example, the password "si\*\*er" may be entered, where it is known that the character \* will not be contained in the mapping table, because \* has previously been defined as a reserved character. The corresponding codeword is

$$\alpha^{220} \quad \alpha^{88} \quad erase\_1 \quad erase\_2 \quad \alpha^{101} \quad \alpha^{114} \quad 0 \dots \quad 0$$

where *erase*<sup>1</sup> and *erase*<sup>2</sup> represent erased (unknown) GF(256) symbols. The decoder for the RS (256,254,3) error-correcting code may be used to solve straightforwardly for these erased symbols. The first step is to produce a reduced echelon parity-check matrix with zeros in the columns corresponding to the positions of the erased symbols, bar one. The procedure is described in detail in Chap. 11.

For two erasures the procedure is trivial and the reduced echelon parity-check matrix **He** is

$$\mathbf{H\_{e}} = \begin{bmatrix} 1 & 1 & 1 & 1 & \dots & 1 \\ \alpha^{1} & (1 + \alpha^{1}) & 0 & (\alpha^{2} + \alpha^{1}) & \dots & (\alpha^{n-1} + \alpha^{1}) \end{bmatrix}$$

Now, the erased symbol *erase*<sup>2</sup> may be solved directly using the second row of **He**

$$\alpha^{220}\alpha^1 + \alpha^{88}(1+\alpha^1) + erase\_2(\alpha^2+\alpha^1) + \alpha^{101}(\alpha^3+\alpha^1) + \alpha^{114}(\alpha^4+\alpha^1) = 0$$

and

$$erase\_2 = (\alpha^2 + \alpha^1)^{q-2}\left(\alpha^{221} + \alpha^{88} + \alpha^{89} + \alpha^{104} + \alpha^{102} + \alpha^{118} + \alpha^{115}\right) = \alpha^{118}$$

Using the first row of **He**, *erase*<sup>1</sup> can now be solved

$$
\alpha^{220} + \alpha^{88} + \alpha^{118} + erase\_1 + \alpha^{101} + \alpha^{114} = 0
$$

and

$$erase\_1 = \alpha^{220} + \alpha^{88} + \alpha^{118} + \alpha^{101} + \alpha^{114} = \alpha^{108}$$

With the reverse mapping the complete password is reconstituted allowing the encrypted information to be retrieved as before.
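The two-erasure solution amounts to a 2 × 2 linear system over GF(256) built from the two rows of **H2** (equivalent to the reduced echelon **He** approach). The following sketch assumes the 0x11D field representation, with helper names that are illustrative; erasures at codeword indices ≥ 1 (i.e. not *p*<sub>1</sub>) are handled:

```python
# Sketch of the two-erasure solver for the (256, 254, 3) extended RS code.
# Codeword indices: 0 = p1, 1 = p2, 1 + i = x_i.  Field representation
# (primitive polynomial 0x11D) and helper names are assumptions.
exp, log = [0] * 255, [0] * 256
v = 1
for n in range(255):
    exp[n], log[v] = v, n
    v = (v << 1) ^ (0x11D if v & 0x80 else 0)

def mul(a, b):
    return 0 if 0 in (a, b) else exp[(log[a] + log[b]) % 255]

def inv(a):                          # multiplicative inverse, a != 0
    return exp[(255 - log[a]) % 255]

def coef(j):                         # row-2 coefficient of codeword index j
    return 1 if j == 1 else exp[j - 1]

def solve_two_erasures(cw, j1, j2):
    # Row 1 of H2 gives e1 + e2 = A (sum of the known symbols); row 2 gives
    # coef(j1)*e1 + coef(j2)*e2 = B.  Solve the 2x2 system over GF(256).
    A = B = 0
    for j, s in enumerate(cw):
        if j in (j1, j2):
            continue
        A ^= s
        if j >= 1:
            B ^= mul(coef(j), s)
    e2 = mul(B ^ mul(coef(j1), A), inv(coef(j1) ^ coef(j2)))
    e1 = A ^ e2
    out = cw[:]
    out[j1], out[j2] = e1, e2
    return out

# Build the "silver"-style codeword, erase x_1 and x_2, then recover them.
x = [0] * 255
for i, e in enumerate([108, 118, 101, 114], start=1):
    x[i] = exp[e]
p2 = 0
for i in range(1, 255):
    p2 ^= mul(exp[i], x[i])
p1 = p2
for i in range(1, 255):
    p1 ^= x[i]
cw = [p1, p2] + x[1:]
rx = cw[:]
rx[2] = rx[3] = 0                    # erase x_1 and x_2
assert solve_two_erasures(rx, 2, 3) == cw
```

Raising to the power *q* − 2 in the text is exactly the `inv` step here, by Fermat's little theorem for the multiplicative group of the field.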

The advantage of marking erasures is that one unknown symbol may be solved for each parity symbol in the RS codeword. Having a relatively large number of parity symbols allows several parts of the entered password to be filled in automatically. Obviously, security is compromised if this procedure is taken to the extreme.

#### **18.3 Summary**

This chapter has described the use of Reed–Solomon codes to correct user mistakes or missing parts of long entered passwords. The system is ideally suited to a smartphone-based encrypted information retrieval system or a password-based authentication system. Dynamic, user-specific mapping of Galois field elements is used to ensure that passwords arbitrarily chosen by the user are valid codewords. A typical system is described based on GF(256) and the ANSI character set, with worked examples given. Security is also enhanced by having user-specific Galois field symbol mapping because, with long passwords, this defeats rainbow tables.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 19 Variations on the McEliece Public Key Cryptosystem**

#### **19.1 Introduction and Background**

In 1978, the distinguished mathematician Robert McEliece invented a public key encryption system [8] based upon encoding the plaintext as codewords of an error-correcting code from the family of Goppa codes [6]. In this system, the ciphertext, sometimes termed the cryptogram, is formed by adding a randomly chosen error pattern containing up to *t* bits to each codeword. One or more such corrupted codewords make up the ciphertext. On reception, the associated private key is used to invoke an error-correcting decoder based upon the underlying Goppa code to correct the errored bits in each codeword, prior to retrieval of the plaintext from all of the information bits present in the decoded codewords.

Since the original invention there have been a number of proposed improvements. For example, in US Patent 5054066, Riek and McFarland improved the security of the system by complementing the error patterns so as to increase the number of errors contained in the cryptogram [14] and cited other variations of the original system.

This chapter is concerned with a detailed description of the original system plus some refinements which enhance the bandwidth efficiency and security of the original arrangement. The security strength of the system is discussed and analysed.

# *19.1.1 Outline of Different Variations of the Encryption System*

In the originally proposed system [8], a codeword is generated from plaintext message bits by using a permuted, scrambled generator matrix of a Goppa code [6] of length *n* symbols, capable of correcting *t* errors. This matrix is the public key. The digital cryptogram is formed from codewords corrupted by exactly *t* randomly or pseudorandomly chosen bit errors. The security is provided by the fact that it is impossible to remove the unknown bit errors unless the original unpermuted Goppa code, the private key, is known, in which case the errors can be removed by correcting them and then descrambling the information bits in the codeword to recover the original message. Any attempt to descramble the information bits without first removing the errors just results in a scrambled mess. In the original paper by McEliece [8], the Goppa codeword length *n* is 1024 and *t* is 50. The number of possible error combinations is 3.19 × 10<sup>85</sup>, equivalent to a secret key of length 284 bits against a brute force attack. (There are more sophisticated attacks which reduce the equivalent secret key length, and these are discussed later in this chapter.)
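The brute-force work-factor figures can be checked directly; this is a sketch, not from the text:

```python
# Checking the claim: C(1024, 50) error patterns for the original McEliece
# parameters, equivalent to a brute-force key of about 284 bits.
from math import comb, log2

patterns = comb(1024, 50)            # number of weight-50 error patterns
assert 3.1e85 < patterns < 3.3e85    # approximately 3.19 x 10^85
assert round(log2(patterns)) == 284  # equivalent secret key length in bits
```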

In a variation of the original theme, after first partitioning the message into message vectors of length *k* bits each and encoding these message vectors into codewords, the codewords are corrupted by a combination of bit errors and bit deletions to form the cryptogram. The number of bit errors in each corrupted codeword is not fixed but is an integer *s*, randomly chosen with the constraint that *s* ≤ *t*. This increases the number of possible error combinations, thereby increasing the security of the system. As a consequence, 2(*t* − *s*) bits may be deleted from each codeword in random positions, adding to the security of the cryptogram as well as reducing its size, without shortening the message. In the case of the original example above, with *t*/2 ≤ *s* ≤ *t* the number of possible error combinations is increased to 3.36 × 10<sup>85</sup> and the average codeword in the cryptogram is reduced to 999 bits from 1024 bits.
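Under the assumption that the *s* error positions are chosen among the *n* = 1024 codeword positions before deletion, the variation's figures can be reproduced; this sketch is illustrative, not from the text:

```python
# The variation's counts for n = 1024, t = 50 with t/2 <= s <= t: the total
# number of error patterns, and the average codeword length after 2(t - s)
# bit deletions (s uniform on 25..50, mean 37.5).  The placement assumption
# (errors chosen among all n positions) is ours, not stated in the text.
from math import comb

t, n = 50, 1024
patterns = sum(comb(n, s) for s in range(t // 2, t + 1))
avg_len = n - 2 * (t - (t // 2 + t) / 2)
assert 3.3e85 < patterns < 3.4e85    # approximately 3.36 x 10^85
assert avg_len == 999.0              # average cryptogram codeword length
```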

Most encryption systems are deterministic: there is a one-to-one correspondence between the message and the cryptogram, with no random variation. Security can be improved through the use of a truly random integer generator, not a pseudorandom generator, to form the cryptogram. Consequently, the cryptogram is not predictable or deterministic. Even with the same message and public key, the cryptogram produced will be different each time, and without knowledge of the random errors and bit deletions, which may be determined only by using the structure of the Goppa code, recovery of the original message is practically impossible.

The basic McEliece encryption system has little resistance to chosen-plaintext (message) attacks. For example, if the same message is encrypted twice and the two cryptograms are added modulo 2, the codeword of the permuted Goppa code cancels out and the result is the sum of the two error patterns. Clearly the encryption method does not provide indistinguishability under chosen-plaintext attack (IND-CPA), a quality measure used by the cryptographic community.

However, an additional technique may be used which does provide IND-CPA and results in semantically secure cryptograms. The technique is to scramble the message twice: in addition to the scrambling by the fixed non-singular matrix contained in the public key, a different scrambler is used for each message. The scrambling function of this second scrambler is derived from the random error vector which is added to the codeword, after encoding using the permuted, scrambled generator matrix of a Goppa code, to produce the corrupted codeword. As the constructed digital cryptogram is a function of truly randomly chosen vectors, not pseudorandomly chosen vectors or a fixed vector, the security of this public key encryption system is enhanced compared to the standard system. Even with an identical message and exactly the same public key, the resulting cryptogram will have no similarity at all to any previously generated cryptogram. This is not true for the standard McEliece public key system, in which two such cryptograms differ in at most 2*t* bit positions. Providing this semantic security eliminates the risk from known plaintext attacks and is useful in several applications, such as RFID, discussed later in the chapter.

An alternative to using a second scrambler is to use a cryptographic hash function such as SHA-256 [11] or SHA-3 [12] to calculate the hash of each *t* bit error pattern and add, modulo 2, the first *k* bits of the hash value to the message prior to encoding. Effectively, the message is encrypted with a stream cipher prior to encoding.
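A minimal sketch of this hash-derived stream scrambling, assuming the error pattern is available as bytes and *k* ≤ 256 for simplicity (function and variable names are illustrative):

```python
# Sketch (not the book's exact construction): XOR the message bits with the
# first k bits of the SHA-256 hash of the random error pattern.
import hashlib

def scramble(message_bits, error_pattern_bytes):
    k = len(message_bits)
    digest = hashlib.sha256(error_pattern_bytes).digest()
    # Take the first k bits of the 256-bit digest as the keystream.
    keystream = [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(k)]
    return [m ^ s for m, s in zip(message_bits, keystream)]

msg = [1, 0, 1, 1, 0, 0, 1, 0]
err = bytes([0x05, 0x11])            # hypothetical error-pattern bytes
assert scramble(scramble(msg, err), err) == msg   # XOR keystream inverts itself
```

Because the keystream is determined by the error pattern, the legitimate decoder recovers it automatically once the errors have been corrected, and no extra key material needs to be transmitted.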

Having provided additional message scrambling, it now becomes safe to represent the generator matrix in reduced echelon form, i.e. a *k* × *k* identity matrix followed by a (*n* − *k*) × *k* matrix for the parity bits. Consequently, the public key may be reduced in size from an *n* × *k* matrix to a (*n* − *k*) × *k* matrix, corresponding typically to a reduction in size of around 65%. This is useful because one of the criticisms of the McEliece system is the relatively large size of the public keys.

Most attacks on the McEliece system are blind attacks and rely on the assumption that there are exactly *t* errors in each corrupted codeword; if there are more than *t* errors, these attacks fail. Consequently, to enhance the security of the system, additional errors known only to intended recipients may be inserted into the digital cryptogram so that each corrupted codeword contains more than *t* errors. A sophisticated method of introducing the additional errors is not necessary: provided there are sufficient additional errors to defeat decryption based on guessing their positions, the message is theoretically unrecoverable from the corrupted digital cryptogram, even with knowledge of the private key. This feature may find applications where a message needs to be distributed to several recipients using the same or different public/private keys at the same time, possibly in a commercial, competitive environment. The corrupted digital cryptograms may be sent to each recipient, arriving asynchronously due to variable network delays, and only a relatively short secret key containing the positions of the additional errors needs to be sent at the same time to all recipients.

In another arrangement designed to enhance the security of the system, additional errors are inserted into each codeword in positions defined by a position vector, which is derived from a cryptographic hash of the previous message vector. Standard hash functions may be used such as SHA-256 [11] or SHA-3 [12]. The first message vector can use a position vector derived from a hash or message already known by the recipient of the cryptogram.

These arrangements may be used in a wide number of different applications such as active and passive RFID, secure barcodes, secure ticketing, magnetic cards, message services, email applications, digital broadcasting, digital communications, video communications and digital storage. Encryption and decryption is amenable to high speed implementation operating at speeds beyond 1 Gbit/s.

#### **19.2 Details of the Encryption System**

The security strength of the McEliece public key encryption system stems from the fact that a truly random binary error pattern is added to the encoded message as part of the digital cryptogram. Even with the same message and the same public key a different digital cryptogram is produced each time. Each message is encoded with a scrambled, binary mapped, permuted, version of a *GF*(2*<sup>m</sup>*) Goppa code. Without the knowledge of the particular Goppa code that is used, the error pattern cannot be corrected and the message cannot be recovered. It is not possible to deduce which particular Goppa code is being used from the public key, which is the matrix used for encoding, because this matrix is a scrambled, permuted version of the original encoding matrix of the Goppa code, plus the fact that for a given *m* there are an extremely large number of Goppa codes [8].

The message information to be sent, if not in digital form, is digitally encoded into binary form comprising a sequence of information bits. The method of encryption is shown in Fig. 19.1. The message comprising a sequence of information bits is formatted by appending dummy bits as necessary into an integral number *m* of binary message vectors of length *k* bits each. This is carried out by *format into message vectors* shown in Fig. 19.1. Each message vector is scrambled and encoded into a codeword, *n* bits long, defined by an error-correcting code which is derived from a binary Goppa code and a scrambling matrix. The binary Goppa code is derived from a non-binary Goppa code and the procedure is described below for a specific example.

The *encode using public key* shown in Fig. 19.1 carries out the scrambling and codeword encoding for each message vector by selecting rows of the codeword generator matrix according to the message bits contained in the message vector. This operation is described in more detail below for a specific example. The codeword generator matrix to be used for encoding is defined by the public key, which is stored in a buffer memory, *public key*, shown in Fig. 19.1. As shown in Fig. 19.1, a random number generator generates a number *s*, internally constrained to be less than or equal to *t*, and this is carried out by *generate number of random errors (s)*. The parameter *t* is the number of bit errors that the Goppa code can correct.

**Fig. 19.1** Public key encryption system with *s* random bit errors and 2(*t* − *s*) bit deletions

**Fig. 19.2** Random integer generator of the number of added, random bit errors, *s*

The number of random errors *s* is input to *generate random errors*, which for each codeword initialises an *n* bit buffer memory with zeros and uses a random number generator to place *s* 1's in *s* random positions of the buffer memory. The contents of the *n* bit buffer are added to the codeword of *n* bits by *add* shown in Fig. 19.1. The 1's are added modulo 2, which inverts the codeword bits in these positions so that these bits are in error. In Fig. 19.1, *t* − *s erasures* takes the input *s*, calculates 2(*t* − *s*) and outputs this value to *position vector*, which comprises a buffer memory of *n* bits containing a sequence of integers corresponding to a position vector described below. The first 2(*t* − *s*) integers are input to *delete bits*, which deletes the bits in the corresponding positions of the codeword so that 2(*t* − *s*) bits of the codeword are deleted. The procedure is carried out for each codeword so that each codeword is randomly shortened by deleted bits and corrupted with a random number of bit errors in random positions. In Fig. 19.1, *format cryptogram* takes the sequence of shortened, corrupted codewords and concatenates them, together with formatting information, to produce the cryptogram.

The highest level of security is provided when the block *generate number of random errors (s)* of Fig. 19.1 is replaced by a truly random number generator and not a pseudorandom generator. An example of a random number generator is shown in Fig. 19.2.

The differential amplifier with high gain amplifies the thermal noise generated by the resistor-terminated inputs. The output of the amplifier is the amplified random noise, which is input to a comparator which carries out binary quantisation. The comparator output is 1 if the amplifier output is a positive voltage and 0 otherwise. This produces 1's and 0's with equal probability at the output of the comparator. The output of the comparator is clocked into a shift register having *p* shift register stages, each of delay *T*. The clock rate is 1/*T*. After *p* clock cycles, the contents of the shift register represent a number in binary which is the random number *s*, having a uniform probability distribution between 0 and 2*<sup>p</sup>* − 1.

One or more of the bits output from the shift register may be permanently set to 1 to provide a lower limit to the random number of errors *s*. As an example, if the 4th bit (counting from the least significant bit) is permanently set to 1, then *s* has a minimum value of 2<sup>3</sup> = 8 and a maximum value of 2*<sup>p</sup>* − 1.

Similarly, the highest level of security is provided if the positions of the errors generated by *generate random errors* of Fig. 19.1 is a truly random number generator and not a pseudorandom generator. An example of an arrangement which generates

**Fig. 19.3** Random integer generator of error positions

truly random positions in the range of 0 to 2*<sup>m</sup>* − 1 corresponding to the codeword length is shown in Fig. 19.3.

As shown in Fig. 19.3, the differential amplifier with high gain amplifies the thermal noise generated by the resistor-terminated inputs. The output of the amplifier is the amplified random noise, which is input to a comparator which outputs a 1 if the amplifier output is a positive voltage and a 0 otherwise. This produces 1's and 0's with equal probability at the output of the comparator. The output of the comparator is clocked into a flip-flop clocked at 1/*T*, with the same clock source as the shift register shown in Fig. 19.3, *shift register*. The output of the flip-flop is a clocked sequence of truly random 1's and 0's which is input to a nonlinear feedback shift register arrangement.

The output of the flip-flop is input to a modulo-2 adder, *add*, where it is added to the outputs of a nonlinear mapping of *u* selected outputs of the shift register. Which outputs are selected corresponds to the key being used. The parameter *u* is a design parameter, typically equal to 8.

The nonlinear mapping *nonlinear mapping* shown in Fig. 19.3 has a pseudorandom one-to-one correspondence between each of the 2*<sup>u</sup>* input states and each of the 2*<sup>u</sup>* output states. An example of such a one-to-one correspondence, for *u* = 4, is given in Table 19.1. For example, the first entry, 0000, value 0, is mapped to 0011, value 3.

The shift register typically has a relatively large number of stages, 64 being typical, and a number of tapped outputs, typically 8. The relationship between the input of the shift register, *a<sub>in</sub>*, and the tapped outputs is usually represented by the delay operator *D*. Defining the tap positions as *w<sub>i</sub>*, for *i* = 0 to *i<sub>max</sub>*, the input to the nonlinear mapping *nonlinear mapping* shown in Fig. 19.3, defined as *x<sub>i</sub>* for *i* = 0 to *i<sub>max</sub>*, is

$$x\_i = a\_{in} D^{w\_i} \tag{19.1}$$

and the output *yj* after the mapping function, depicted as *M* is

$$\mathbf{y}\_{j} = M[\mathbf{x}\_{i}] = M[a\_{in}D^{w\_{i}}] \tag{19.2}$$

The input to the shift register is the output of the adder given by the sum of the random input *Rnd* and the summed output of the mapped outputs. Accordingly,


$$a\_{in} = R\_{nd} + \sum\_{j=0}^{i\_{\text{max}}} \mathbf{y}\_j = R\_{nd} + \sum\_{j=0}^{i\_{\text{max}}} M[\mathbf{x}\_i] = R\_{nd} + \sum\_{j=0}^{i\_{\text{max}}} M[a\_{in} D^{w\_i}] \tag{19.3}$$

It can be seen that the shift register input *ain* is a nonlinear function of delayed outputs of itself added to the random input *Rnd* , and so will be a random binary function.

The positions of the errors are given by the output of *m-bit input* shown in Fig. 19.3, an *m*-bit memory register, and defined as *e<sub>pos</sub>*. Consider that the first *m* outputs of the shift register are used as the input to *m-bit input*. The output of *m-bit input* is a binary representation of a number given by

$$e\_{pos} = \sum\_{j=0}^{m-1} 2^j \times a\_{in} D^j \tag{19.4}$$

Since *ain* is a random binary function, *epos* will be an integer between 0 and 2*<sup>m</sup>* − 1 randomly distributed with a uniform distribution. As shown in Fig. 19.3, these randomly generated integers are stored in memory in *error positions buffer memory* after *eliminate repeats* has eliminated any repeated numbers, since repeated integers will occur from time to time in any independently distributed random integer generator.
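The arrangement of Fig. 19.3 can be sketched in software; the tap positions, register length, and the particular one-to-one mapping below are illustrative choices, not values from the text, with `secrets.randbits` standing in for the thermal noise source *R<sub>nd</sub>*.

```python
import random
import secrets

def error_positions(count: int, m: int = 5, reg_len: int = 64,
                    taps=(3, 9, 17, 24, 31, 40, 52, 63)) -> list[int]:
    """Sketch of Fig. 19.3: a shift register whose input is a truly random
    bit added (mod 2) to the mod-2 sum of a nonlinear one-to-one mapping M
    of u tapped stages.  The first m stages are read as the integer e_pos
    of Eq. (19.4), and repeated values are discarded."""
    u = len(taps)
    rng = random.Random(0xC0DE)           # key-dependent, fixed permutation M
    mapping = list(range(1 << u))
    rng.shuffle(mapping)                  # one-to-one mapping of 2**u states
    reg = [0] * reg_len
    positions, seen = [], set()
    while len(positions) < count:
        x = sum(reg[t] << i for i, t in enumerate(taps))  # u selected outputs
        y = mapping[x]                                    # nonlinear mapping
        fb = bin(y).count('1') & 1                        # mod-2 sum of mapped outputs
        a_in = secrets.randbits(1) ^ fb                   # add the random bit R_nd
        reg = [a_in] + reg[:-1]                           # clock the register
        e_pos = sum(reg[j] << j for j in range(m))        # Eq. (19.4)
        if e_pos not in seen:                             # eliminate repeats
            seen.add(e_pos)
            positions.append(e_pos)
    return positions

pos = error_positions(4)
assert len(set(pos)) == 4 and all(0 <= p < 32 for p in pos)
```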

The random bit errors and bit deletions can only be corrected with knowledge of the particular non-binary Goppa code, the private key, which is used in deriving the codeword generator matrix. Reviewing the background on Goppa codes: Goppa defined a family of codes [6] where the coordinates of each codeword {*c*<sub>0</sub>, *c*<sub>1</sub>, *c*<sub>2</sub>, ..., *c*<sub>2<sup>m</sup>−1</sub>}, with {*c*<sub>0</sub> = *x*<sub>0</sub>, *c*<sub>1</sub> = *x*<sub>1</sub>, *c*<sub>2</sub> = *x*<sub>2</sub>, ..., *c*<sub>2<sup>m</sup>−1</sub> = *x*<sub>2<sup>m</sup>−1</sub>}, satisfy the congruence *p*(*z*) modulo *g*(*z*) = 0, where *g*(*z*) is now known as the Goppa polynomial and *p*(*z*) is the Lagrange interpolation polynomial.

Goppa codes have coefficients from *GF*(2<sup>*m*</sup>) and, provided *g*(*z*) has no roots which are elements of *GF*(2<sup>*m*</sup>) (which is straightforward to achieve), the Goppa codes have parameters (2<sup>*m*</sup>, *k*, 2<sup>*m*</sup> − *k* + 1). Goppa codes can be converted into binary codes. Provided that *g*(*z*) has no roots which are elements of *GF*(2<sup>*m*</sup>) and has no repeated roots, the binary code parameters are (2<sup>*m*</sup>, 2<sup>*m*</sup> − *mt*, *d*<sub>min</sub>) where *d*<sub>min</sub> ≥ 2*t* + 1, the Goppa code bound on minimum Hamming distance. Most binary Goppa codes attain this bound with equality, and *t* is the number of correctable errors.

For a Goppa polynomial of degree *r*, there are *r* parity check equations defined from the congruence. Denoting *g*(*z*) by

$$g(z) = g\_r z^r + g\_{r-1} z^{r-1} + g\_{r-2} z^{r-2} + \dots + g\_1 z + g\_0 \tag{19.5}$$

$$\sum\_{i=0}^{2^m - 1} \frac{c\_i}{z - \alpha\_i} = 0 \quad \text{modulo } \mathbf{g}(z) \tag{19.6}$$

Since Eq. (19.6) is modulo *g*(*z*), *g*(*z*) is equivalent to 0 and we can add *g*(*z*) to the numerator. Dividing each term *z* − α*<sub>i</sub>* into 1 + *g*(*z*) produces the following:

$$\frac{g(z) + 1}{z - \alpha\_i} = q\_i(z) + \frac{r\_m + 1}{z - \alpha\_i} \tag{19.7}$$

where *rm* is the remainder, an element of *GF*(2*<sup>m</sup>*) after dividing *g*(*z*) by *z* − α*i*.

As *r<sub>m</sub>* is a scalar, *g*(*z*) may simply be pre-multiplied by 1/*r<sub>m</sub>* so that the remainder cancels with the other numerator term, which is 1:

$$\frac{\frac{g(z)}{r\_m} + 1}{z - \alpha\_i} = \frac{q\_i(z)}{r\_m} + \frac{\frac{r\_m}{r\_m} + 1}{z - \alpha\_i} = \frac{q\_i(z)}{r\_m} \tag{19.8}$$

As

$$\mathbf{g}(z) = (z - \alpha\_i)q\_i(z) + r\_m \tag{19.9}$$

When *z* = α*i*, *rm* = *g*(α*i*).

Substituting for *rm* in Eq. (19.8) produces

$$\frac{\frac{g(z)}{g(\alpha\_i)} + 1}{z - \alpha\_i} = \frac{q\_i(z)}{g(\alpha\_i)}\tag{19.10}$$

Since *g*(*z*)/*g*(α*<sub>i</sub>*) modulo *g*(*z*) = 0,

$$\frac{1}{z - \alpha\_i} = \frac{q\_i(z)}{\mathbf{g}(\alpha\_i)}\tag{19.11}$$

The quotient polynomial *q<sub>i</sub>*(*z*) is a polynomial of degree *r* − 1 with coefficients which are a function of α*<sub>i</sub>* and the Goppa polynomial coefficients. Denoting *q<sub>i</sub>*(*z*) as

$$q\_i(z) = q\_{i,0} + q\_{i,1}z + q\_{i,2}z^2 + q\_{i,3}z^3 + \dots + q\_{i,(r-1)}z^{r-1} \tag{19.12}$$

Since the coefficients of each power of *z* sum to zero, the *r* parity check equations are given by

$$\sum\_{i=0}^{2^m - 1} \frac{c\_i q\_{i,j}}{g(\alpha\_i)} = 0 \quad \text{for} \quad j = 0 \quad \text{to} \quad r - 1 \tag{19.13}$$

If the Goppa polynomial has any roots which are elements of *GF*(2*<sup>m</sup>*), say α*j*, then the codeword coordinate *cj* has to be permanently set to zero in order to satisfy the parity check equations. Effectively the codelength is shortened by the number of roots of *g*(*z*) which are elements of *GF*(2*<sup>m</sup>*). Usually the Goppa polynomial is chosen to have distinct roots which are not in *GF*(2*<sup>m</sup>*).

The security depends upon the number of bit errors added and in practical examples to provide sufficient security, it is necessary to use long Goppa codes of length 2048 bits, 4096 bits or longer. For brevity, the procedure will be described using an example of a binary Goppa code of length 32 bits capable of correcting 4 bit errors. It is important to note that all binary Goppa codes are derived from non-binary Goppa codes which are designed first.

In this example, the non-binary Goppa code consists of 32 symbols from the Galois field *GF*(2<sup>5</sup>), each symbol taking one of 32 possible values, with the code capable of correcting two errors. There are 28 information symbols and 4 parity check symbols, so the non-binary Goppa code has the parameters of a (32, 28, 5) code. (It should be noted that when the Goppa code is used with information symbols restricted to binary values, as in a binary Goppa code, twice as many errors can be corrected.) The 4 parity check symbols are defined by the 4 parity check equations, and the Goppa polynomial has degree 4. Choosing arbitrarily as the Goppa polynomial the polynomial 1 + *z* + *z*<sup>4</sup>, which has roots only in *GF*(16) and none in *GF*(32), we determine *q<sub>i</sub>*(*z*) by dividing by *z* − α*<sub>i</sub>*:

$$q\_i(z) = z^3 + \alpha\_i z^2 + \alpha\_i^2 z + (1 + \alpha\_i^3) \tag{19.14}$$
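This division can be checked mechanically. The sketch below, in Python, implements *GF*(2<sup>5</sup>) arithmetic under the field construction of Table 19.2 (1 + α<sup>2</sup> + α<sup>5</sup> = 0) and verifies by synthetic (Horner) division that dividing *g*(*z*) = 1 + *z* + *z*<sup>4</sup> by *z* − α yields the quotient of Eq. (19.14) and remainder *g*(α); the function names are illustrative.

```python
# GF(2^5) defined by 1 + a^2 + a^5 = 0, i.e. reduction polynomial 0b100101
MOD = 0b100101

def gf_mul(x: int, y: int) -> int:
    """Carry-less multiply of two GF(2^5) elements, reduced mod 1 + a^2 + a^5."""
    r = 0
    while y:
        if y & 1:
            r ^= x
        y >>= 1
        x <<= 1
        if x & 0b100000:
            x ^= MOD
    return r

def divide_goppa(a: int):
    """Synthetic division of g(z) = z^4 + z + 1 by (z - a) over GF(2^5):
    returns the quotient coefficients [q3, q2, q1, q0] and the remainder."""
    g = [1, 0, 0, 1, 1]           # coefficients of z^4, z^3, z^2, z, 1
    q, acc = [], 0
    for c in g[:-1]:
        acc = gf_mul(acc, a) ^ c  # Horner step
        q.append(acc)
    rem = gf_mul(acc, a) ^ g[-1]  # this equals g(a)
    return q, rem

# check Eq. (19.14) for every field element a
for a in range(32):
    q, rem = divide_goppa(a)
    a2 = gf_mul(a, a)
    a3 = gf_mul(a2, a)
    assert q == [1, a, a2, a3 ^ 1]       # z^3 + a z^2 + a^2 z + (1 + a^3)
    assert rem == gf_mul(a3, a) ^ a ^ 1  # g(a) = a^4 + a + 1
```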

The 4 parity check equations are

$$\sum\_{i=0}^{31} \frac{c\_i}{\mathbf{g}(\alpha\_i)} = \mathbf{0} \tag{19.15}$$

$$\sum\_{i=0}^{31} \frac{c\_i \alpha\_i}{g(\alpha\_i)} = 0 \tag{19.16}$$


$$\sum\_{i=0}^{31} \frac{c\_i \alpha\_i^2}{\mathbf{g}(\alpha\_i)} = 0 \tag{19.17}$$

$$\sum\_{i=0}^{31} \frac{c\_i (1 + \alpha\_i^3)}{g(\alpha\_i)} = 0 \tag{19.18}$$

Using the *GF*(2<sup>5</sup>) extension field table, Table 19.2, to evaluate the different terms, the parity check matrix is

$$\mathbf{H}\_{(32,28,5)} = \begin{bmatrix} 1 & 1 & \alpha^{14} & \alpha^{28} & \alpha^{20} & \alpha^{25} & \dots & \alpha^{10} \\ 0 & 1 & \alpha^{15} & \alpha^{30} & \alpha^{23} & \alpha^{29} & \dots & \alpha^{9} \\ 0 & 1 & \alpha^{16} & \alpha & \alpha^{26} & \alpha^{2} & \dots & \alpha^{8} \\ 1 & 0 & \alpha^{12} & \alpha^{24} & \alpha^{5} & \alpha^{17} & \dots & \alpha^{5} \end{bmatrix} \tag{19.19}$$

in which column *i* contains 1/*g*(α*<sub>i</sub>*), α*<sub>i</sub>*/*g*(α*<sub>i</sub>*), α*<sub>i</sub>*<sup>2</sup>/*g*(α*<sub>i</sub>*) and (1 + α*<sub>i</sub>*<sup>3</sup>)/*g*(α*<sub>i</sub>*), the four parity check equations (19.15) through (19.18).

To implement the Goppa code as a binary code, the symbols in the parity check matrix are replaced with the *m*-bit binary column representations of each respective *GF*(2<sup>*m*</sup>) symbol. For the (32, 28, 5) Goppa code above, each of the 4 parity symbols is represented as a 5-bit symbol from Table 19.2, so the parity check matrix of the binary code has 20 rows. The minimum Hamming distance of the binary Goppa code is improved from *r* + 1 to 2*r* + 1. Correspondingly, the example binary Goppa code becomes a (32, 12, 9) code with parity check matrix:

$$\mathbf{H}\_{(32,12,9)} = \begin{bmatrix} 111001 \dots 1 \\ 000100 \dots 0 \\ 001110 \dots 0 \\ 001011 \dots 0 \\ 001101 \dots 1 \\ 011011 \dots 0 \\ 001110 \dots 1 \\ 001010 \dots 0 \\ 001011 \dots 1 \\ 001100 \dots 1 \\ 011110 \dots 1 \\ 001010 \dots 0 \\ 000011 \dots 1 \\ 001000 \dots 1 \\ 001010 \dots 0 \\ 100011 \dots 1 \\ 001101 \dots 0 \\ 001110 \dots 1 \\ 001100 \dots 0 \\ 000101 \dots 0 \end{bmatrix} \tag{19.20}$$

**Table 19.2** *GF*(32) non-zero extension field elements defined by 1 + α<sup>2</sup> + α<sup>5</sup> = 0

$$\begin{array}{llll}
\alpha^0 = 1 & \alpha^8 = 1+\alpha^2+\alpha^3 & \alpha^{16} = 1+\alpha+\alpha^3+\alpha^4 & \alpha^{24} = \alpha+\alpha^2+\alpha^3+\alpha^4 \\
\alpha^1 = \alpha & \alpha^9 = \alpha+\alpha^3+\alpha^4 & \alpha^{17} = 1+\alpha+\alpha^4 & \alpha^{25} = 1+\alpha^3+\alpha^4 \\
\alpha^2 = \alpha^2 & \alpha^{10} = 1+\alpha^4 & \alpha^{18} = 1+\alpha & \alpha^{26} = 1+\alpha+\alpha^2+\alpha^4 \\
\alpha^3 = \alpha^3 & \alpha^{11} = 1+\alpha+\alpha^2 & \alpha^{19} = \alpha+\alpha^2 & \alpha^{27} = 1+\alpha+\alpha^3 \\
\alpha^4 = \alpha^4 & \alpha^{12} = \alpha+\alpha^2+\alpha^3 & \alpha^{20} = \alpha^2+\alpha^3 & \alpha^{28} = \alpha+\alpha^2+\alpha^4 \\
\alpha^5 = 1+\alpha^2 & \alpha^{13} = \alpha^2+\alpha^3+\alpha^4 & \alpha^{21} = \alpha^3+\alpha^4 & \alpha^{29} = 1+\alpha^3 \\
\alpha^6 = \alpha+\alpha^3 & \alpha^{14} = 1+\alpha^2+\alpha^3+\alpha^4 & \alpha^{22} = 1+\alpha^2+\alpha^4 & \alpha^{30} = \alpha+\alpha^4 \\
\alpha^7 = \alpha^2+\alpha^4 & \alpha^{15} = 1+\alpha+\alpha^2+\alpha^3+\alpha^4 & \alpha^{23} = 1+\alpha+\alpha^2+\alpha^3 & \\
\end{array}$$
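Such a field table is generated by repeatedly multiplying by α and substituting α<sup>5</sup> = 1 + α<sup>2</sup> whenever the degree reaches 5. A minimal Python sketch (the function name is illustrative; each element is held as a 5-bit integer whose bit *i* is the coefficient of α<sup>*i*</sup>):

```python
def gf32_table() -> dict[int, int]:
    """Generate the non-zero elements of GF(32) as in Table 19.2,
    using the defining relation 1 + a^2 + a^5 = 0 (so a^5 = 1 + a^2)."""
    table = {}
    x = 1                       # a^0
    for j in range(31):
        table[j] = x
        x <<= 1                 # multiply by a
        if x & 0b100000:        # reduce: a^5 -> 1 + a^2
            x ^= 0b100101
    return table

t = gf32_table()
assert t[5] == 0b00101          # a^5  = 1 + a^2
assert t[17] == 0b10011         # a^17 = 1 + a + a^4
assert t[30] == 0b10010         # a^30 = a + a^4
assert len(set(t.values())) == 31   # all 31 non-zero elements are distinct
```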

The next step is to put the parity check matrix into reduced echelon form using elementary matrix row and column operations, so that the 20 rows represent 20 independent parity check equations, one for each parity bit. From the reduced echelon parity check matrix the generator matrix is obtained straightforwardly: with the parity check matrix in the form [**A** | **I**<sub>20</sub>], the generator matrix is [**I**<sub>12</sub> | **A**<sup>T</sup>]. The resulting generator matrix is:

$$\mathbf{G}\_{(32,12,9)} = \begin{bmatrix} 10000000000011000111101001110110 \\ 01000000000011010110000110101100 \\ 00100000000000110011000011010010 \\ 00010000000011111101010110001010 \\ 00001000000011000011010011100001 \\ 00000100000011111001100010101001 \\ 00000010000000001011111100011100 \\ 00000001000000111001100111000100 \\ 00000000100011110111111101111101 \\ 00000000010011101101111001001011 \\ 00000000001000011100111011001100 \\ 00000000000111101100100110110111 \end{bmatrix} \tag{19.21}$$

It will be noticed that the generator matrix is in reduced echelon form and has 12 rows, one row for each information bit. Each row is the codeword resulting from that information bit equal to a 1, all other information bits equal to 0.

The next step is to scramble the information bits by multiplying by a *k* × *k* non-singular matrix, that is, one that is invertible. As a simple example, the following 12 × 12 matrix is invertible.

$$\mathbf{NS}\_{12\times12} = \begin{bmatrix} 011101001110 \\ 001110100111 \\ 100111010011 \\ 110011101001 \\ 111001110100 \\ 011100111010 \\ 001110011101 \\ 100111001110 \\ 010011100111 \\ 101001110011 \\ 110100111001 \\ 111010011100 \end{bmatrix} \tag{19.22}$$

The above is invertible using the following matrix:

$$\mathbf{NS}\_{12\times12}^{-1} = \begin{bmatrix} 110100000000 \\ 011010000000 \\ 001101000000 \\ 000110100000 \\ 000011010000 \\ 000001101000 \\ 000000110100 \\ 000000011010 \\ 000000001101 \\ 100000000110 \\ 010000000011 \\ 101000000001 \end{bmatrix} \tag{19.23}$$

The next step is to scramble the generator matrix with the non-singular matrix to produce the scrambled generator matrix given below. The code produced with this generator matrix has the same codewords as the generator matrix given by matrix (19.21) and can correct the same number of errors but there is a different mapping to codewords from a given information bit pattern.

$$\mathbf{SG}\_{(32,12,9)} = \mathbf{NS}\_{12\times12} \times \mathbf{G}\_{(32,12,9)} \tag{19.24}$$

It may be seen that, for example, the first row of this matrix is the modulo 2 sum of rows 1, 2, 3, 5, 8, 9 and 10 of matrix (19.21) in accordance with the non-singular matrix (19.22).
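This scrambling is an ordinary matrix product over *GF*(2): each row of the scrambled generator matrix is the modulo-2 sum of the generator matrix rows selected by the corresponding row of the non-singular matrix. A minimal sketch in Python (the toy matrices are illustrative, not the (32, 12, 9) matrices of the text):

```python
def gf2_matmul(A: list[list[int]], B: list[list[int]]) -> list[list[int]]:
    """Multiply two binary matrices over GF(2): each entry is a mod-2 dot product."""
    return [[sum(A[i][k] & B[k][j] for k in range(len(B))) & 1
             for j in range(len(B[0]))]
            for i in range(len(A))]

# toy example: row 0 of NS selects rows 0 and 1 of G,
# so row 0 of SG is their modulo-2 sum
NS = [[1, 1],
      [0, 1]]
G  = [[1, 0, 1, 1],
      [0, 1, 0, 1]]
SG = gf2_matmul(NS, G)
assert SG[0] == [1, 1, 1, 0]    # G row 0 XOR G row 1
assert SG[1] == [0, 1, 0, 1]    # G row 1 alone
```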

The final step in producing the public key generator matrix for the codewords from the message vectors is to permute the columns of the matrix above. Any permutation may be randomly chosen. For example we may use the following permutation:

$$\begin{pmatrix} 27 & 15 & 4 & 2 & 19 & 21 & 17 & 14 & 7 & 16 & 20 & 1 & 29 & 8 & 11 & 12 & 25 & 5 & 30 & 24 & 6 & 18 & 13 & 3 & 0 & 26 & 23 & 28 & 22 & 31 & 9 & 10 \\ 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 & 16 & 17 & 18 & 19 & 20 & 21 & 22 & 23 & 24 & 25 & 26 & 27 & 28 & 29 & 30 & 31 \end{pmatrix} \tag{19.25}$$

so that for example column 0 of matrix (19.24) becomes column 24 of the permuted generator matrix and column 31 of matrix (19.24) becomes column 29 of the permuted generator matrix. The resulting, permuted generator matrix is given below.

$$\mathbf{PSG}\_{(32,12,9)} = \begin{bmatrix} 00011111001111010110001100001111 \\ 11111000011010100000100101001111 \\ 00101100111000110110001110110101 \\ 01101101011111101100110010001000 \\ 10010110100100000111110010001010 \\ 10011100101111010000111101011101 \\ 10110100101011111000001100110010 \\ 00100110001011001110010110010011 \\ 01100001011100111101111000111011 \\ 00011110111010110110101010011001 \\ 00001110101101111010101110001000 \\ 10110000101101010011001011011110 \end{bmatrix} \tag{19.26}$$
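Applying the permutation is a column re-indexing. A sketch in Python, following the convention of (19.25) that the entry at position *j* names the source column (the function name is illustrative):

```python
def permute_columns(row: list[int], perm: list[int]) -> list[int]:
    """Apply the public permutation to one matrix row: column perm[j] of
    the input becomes column j of the output, so e.g. with perm[24] == 0
    the input's column 0 lands in output position 24."""
    return [row[perm[j]] for j in range(len(perm))]

# toy example over 3 columns
row = [10, 20, 30]
perm = [2, 0, 1]
out = permute_columns(row, perm)
assert out == [30, 10, 20]       # source column 0 has moved to position 1
```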

With this particular example of the Goppa code, the message needs to be split into message vectors of length 12 bits, adding padding bits as necessary so that there is an integral number of message vectors. As a simple example of a plaintext message, consider that the message consists of a single message vector with the information bit pattern:

{0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1}

Starting with an all-zeros vector, for each information bit equal to 1, the row of the permuted, scrambled matrix (19.26) in the same position is added modulo 2 to the result so far, producing the codeword which will form the digital cryptogram once random errors are added. In this example, the codeword is generated by adding, modulo 2, rows 2, 4, 5, 6 and 12 of the permuted, scrambled matrix (19.26) to produce:

00011111001111010110001100001111
00101100111000110110001110110101
01101101011111101100110010001000
10010110100100000111110010001010
10110000101101010011001011011110
────────────────────────────────
01111000100001011000001001100110 (19.27)

The resulting codeword is:

{01111000100001011000001001100110}
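Encoding is therefore just a running XOR of the selected generator matrix rows. A minimal sketch in Python (the toy 3 × 6 matrix is illustrative, not the matrix of the text):

```python
def encode(message: list[int], generator_rows: list[str]) -> str:
    """Encode by adding (mod 2) the generator matrix rows that correspond
    to 1-bits of the message, starting from the all-zeros vector."""
    acc = [0] * len(generator_rows[0])
    for bit, row in zip(message, generator_rows):
        if bit:
            acc = [a ^ int(r) for a, r in zip(acc, row)]
    return ''.join(map(str, acc))

# toy example: message bits 1 and 3 select rows 1 and 3
rows = ['100110', '010011', '001101']
assert encode([1, 0, 1], rows) == '101011'
```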

This Goppa code can correct up to 4 errors (*t* = 4), so a random number *s* is chosen for the number of bits to be in error, and 2(*t* − *s*) bits are deleted from the codeword in pre-determined positions. The pre-determined positions may be given by a secret key, a position vector, known only to the originator and intended recipient of the cryptogram. It may be included as part of the public key, or may be contained in a previous cryptogram sent to the recipient. An example of a position vector, which defines the bit positions to be deleted, is:

{19, 3, 27, 17, 8, 30, 11, 15, 2, 5, 19, ..., 25}.

The notation is, for example, that if there are 2 bits to be deleted, the bit positions to be deleted are the first 2 positions in the position vector, 19 and 3. As well as the secret key, the position vector, the recipient needs to know the number of deleted bits, preferably with this information provided in a secure way. One method is for each message vector to contain, as part of the message, a number indicating the number of bits to be deleted in the following codeword (not the current codeword); the first codeword has a known, fixed number of deleted bits.

The number of bit errors and the bit error positions are randomly chosen. A truly random source, such as the thermal noise source described above, produces the most secure results, but a pseudorandom generator can be used instead, particularly if seeded from the time of day with a fine time resolution such as 1 ms. If the number of random errors chosen is too small, the security of the digital cryptogram will be compromised. Correspondingly, the minimum number of errors is a design parameter depending upon the length of the Goppa code and *t*, the number of correctable errors. A suitable choice for the minimum number of errors in practice lies between *t*/2 and 3*t*/4.

For the example above, consider that the number of bit errors is 2 and these are randomly chosen to be in positions 7 and 23 (starting the position index from 0). The bits in these positions in the codeword are inverted to produce the result:

{01111001100001011000001101100110}

As there are 2 bits in error, 4 bits (2(*t* − *s*) = 2(4 − 2)) may be deleted. Using the position vector example above, the deleted bits are in positions {19, 3, 27 and 17} resulting in 28 bits,

{0111001100001011000110110110}.

This vector forms the digital cryptogram which is transmitted or stored depending upon the application.
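The corruption step, invert the chosen error bits and then remove the bits at the deletion positions, can be sketched in Python and checked against the worked example (the function name is illustrative):

```python
def corrupt(codeword: str, error_pos: list[int], delete_pos: list[int]) -> str:
    """Invert the randomly chosen error bits, then remove the bits at the
    pre-determined deletion positions given by the position vector."""
    bits = [int(b) for b in codeword]
    for p in error_pos:
        bits[p] ^= 1                      # insert a bit error
    dele = set(delete_pos)
    return ''.join(str(b) for i, b in enumerate(bits) if i not in dele)

cw = '01111000100001011000001001100110'        # codeword from (19.27)
out = corrupt(cw, error_pos=[7, 23], delete_pos=[19, 3, 27, 17])
assert out == '0111001100001011000110110110'   # the 28-bit cryptogram
```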

The intended recipient of this cryptogram retrieves the message in a series of steps. Figure 19.4 shows the decryption system. The retrieved cryptogram is formatted into corrupted codewords by *format into corrupted codewords* shown in Fig. 19.4. In the formatting process, the number of deleted bits in each codeword is determined from the retrieved length of each codeword. The next step is to insert 0's in the deleted bit positions so that each corrupted codeword is of the correct length. This is carried out by *fill erased positions with 0's*, which takes as input the position vector stored in a buffer memory as *position vector* in Fig. 19.4 and the number of deleted (erased) bits from *format into corrupted codewords*. For the example above, the recipient first receives or otherwise retrieves the cryptogram {0111001100001011000110110110}. Knowing the number of deleted bits and their positions, the recipient inserts 0's in positions {19, 3, 27 and 17} to produce:

{01101001100001011000001101100110}
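The zero-filling step can be sketched in Python and checked against the worked example (the function name is illustrative):

```python
def fill_erasures(cryptogram: str, delete_pos: list[int], n: int = 32) -> str:
    """Re-insert 0's at the deleted bit positions so the corrupted codeword
    regains its full length n; the erased positions are then handled by the
    errors-and-erasures decoder."""
    remaining = iter(cryptogram)
    dele = set(delete_pos)
    return ''.join('0' if i in dele else next(remaining) for i in range(n))

out = fill_erasures('0111001100001011000110110110', [19, 3, 27, 17])
assert out == '01101001100001011000001101100110'
```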

**Fig. 19.4** Private key decryption system with *s* random bit errors and 2(*t* − *s*) bit deletions

The private key contains the information of which Goppa code was used, the inverse of the non-singular matrix used to scramble the data and the permutation applied to codeword symbols in constructing the public key generator matrix. This information is stored in *private key* in Fig. 19.4.

For the example, the private key is used to undo the permutation applied to codeword symbols by applying the following permutation:

$$\begin{pmatrix} 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 & 16 & 17 & 18 & 19 & 20 & 21 & 22 & 23 & 24 & 25 & 26 & 27 & 28 & 29 & 30 & 31 \\ 27 & 15 & 4 & 2 & 19 & 21 & 17 & 14 & 7 & 16 & 20 & 1 & 29 & 8 & 11 & 12 & 25 & 5 & 30 & 24 & 6 & 18 & 13 & 3 & 0 & 26 & 23 & 28 & 22 & 31 & 9 & 10 \end{pmatrix} \tag{19.28}$$

so that, for example, bit 24 becomes bit 0 after permutation and bit 27 becomes bit 31 after permutation. The resulting, corrupted codeword is:

{00011001110011110001000101100001}

The permutation is carried out by *permute bits* shown in Fig. 19.4.
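Undoing the public permutation sends each bit back to its original column. A sketch in Python, following the convention of (19.28) that bit *j* of the received word is sent to position *perm*[*j*] (the function name is illustrative):

```python
def unpermute(bits: str, perm: list[int]) -> str:
    """Undo the public column permutation: bit j of the received word
    is sent to position perm[j], e.g. with perm[24] == 0, bit 24 of the
    input becomes bit 0 of the output."""
    out = [''] * len(bits)
    for j, b in enumerate(bits):
        out[perm[j]] = b
    return ''.join(out)

# toy example: this inverts a forward permutation out[j] = in[perm[j]]
perm = [2, 0, 1]
assert unpermute('ZXY', perm) == 'XYZ'
```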

The next step is to treat the bits in the corrupted codeword as *GF*(2<sup>5</sup>) symbols and use the parity check matrix (19.19) from the private key to calculate the syndrome value for each row of the parity check matrix, producing α<sup>28</sup>, α<sup>7</sup>, α<sup>13</sup> and α<sup>19</sup>. This is carried out by an errors and erasures decoder as a first step in correcting the errors and erasures. The errors and erasures are corrected by *errors and erasures correction*, which knows the positions of the erased bits from *fill erased positions with 0's* shown in Fig. 19.4.

In the example, the errors and erasures are corrected using the syndrome values to produce the uncorrupted codeword. There are several published algorithms for errors and erasures decoding [1, 13, 16]. Using, for example, the method described by Sugiyama [16], the uncorrupted codeword is obtained:

{10001001110000100111011001100101}

The scrambled information data is the first 12 bits of this codeword:

{100010011100}

The last step is to unscramble the scrambled data using matrix (19.23) to produce the original message after formatting the unscrambled data:

$$\{0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1\}$$

In Fig. 19.4, *descramble message vectors* takes as input the inverse of the non-singular matrix stored in *private key* and outputs the descrambled message vectors to *format message*.

In practice, much longer codes of length *n* would be used than described above. Typically *n* is set equal to 1024, 2048, 4096 bits or longer. Longer codes are more secure, but the public key is larger and encryption and decryption take longer.

Consider an example with *n* = 1024, correcting *t* = 60 bit errors with a randomly chosen irreducible Goppa polynomial of degree 60, say *g*(*z*) = 1 + *z* + *z*<sup>2</sup> + *z*<sup>23</sup> + *z*<sup>60</sup>.

Setting the number of inserted bit errors *s* as a randomly chosen number from 40 to 60, the number of deleted bits is correspondingly 2(*t* − *s*), ranging from 40 to 0, and the average codeword length is 994 bits. There are 9.12 × 10<sup>96</sup> different bit error combinations, providing security against naive brute force decoding equivalent to a random key of length 325 bits. The message vector length is 424 bits per codeword, of which 6 bits may be assigned to indicate the number of deleted bits in the following codeword. It should be noted that there are more effective attacks than brute force decoding, as discussed in Sect. 19.5.
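One ingredient of such security estimates is the count of weight-*t* error patterns in an *n*-bit codeword, which can be evaluated exactly. A short Python sketch (the function name is illustrative, and this counts only the fixed-weight patterns, not the full combination count quoted in the text):

```python
from math import comb, log2

def brute_force_bits(n: int, t: int) -> float:
    """log2 of the number of weight-t error patterns in an n-bit codeword,
    a rough measure of the work factor of naive brute-force decoding."""
    return log2(comb(n, t))

bits = brute_force_bits(1024, 60)
assert 320 < bits < 330   # roughly 325 bits, in line with the text
```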

As another example, with *n* = 2048 and correcting *t* = 80 bit errors, a randomly chosen irreducible Goppa polynomial of degree 80 may be used, an example being *g*(*z*) = 1 + *z* + *z*<sup>3</sup> + *z*<sup>17</sup> + *z*<sup>80</sup>.

Setting the number of inserted bit errors *s* as a randomly chosen number from 40 to 80, the number of deleted bits is correspondingly 2(*t* − *s*), ranging from 80 to 0, and the average codeword length is 2008 bits. There are 2.45 × 10<sup>144</sup> different bit error combinations, providing security against naive brute force decoding equivalent to a random key of length 482 bits. The message vector length is 1168 bits per codeword, of which 7 bits may be assigned to indicate the number of deleted bits in the following codeword.

In a hybrid arrangement, where the sender and recipient share secret information, additional bits in error may be deliberately added to the cryptogram using a secret key, the position vector, to determine the positions of the additional error bits. The number of additional bits in error is randomly chosen between 0 and *n* − 1. The recipient needs to know the number of additional bits in error (as well as the position vector), preferably with this information provided in a secure way. One method is for the message vector to contain, as part of the message, the number of additional bits in error in the next codeword, that is, the following codeword (not the current codeword). It is arranged that the first codeword has a known, fixed number of additional bits in error.

As each corrupted codeword contains more than *t* bits in error, it is theoretically impossible, even with knowledge of the private key, to recover the original codewords free from errors and to determine the unknown bits in the deleted bit positions. It should be noted that this arrangement defeats attacks based on information set decoding, which is discussed later. The system is depicted in Fig. 19.5.

This encryption arrangement is as shown in Fig. 19.1 except that the system accommodates additional errors added by *generate additional errors* shown in Fig. 19.5 using a random integer generator between 0 and *n*−1 generated by *generate random number of additional errors*. Any suitable random integer generator may be used. For example, the random integer generator design shown in Fig. 19.2 may be used with the number of shift register stages *p* now set equal to *m*, where *n* = 2*<sup>m</sup>*.

**Fig. 19.5** Public key encryption system with *s* random bit errors, 2(*t* − *s*) bit deletions and a random number of additional errors

Additional errors may be added in the same positions as random errors, as this provides for a simpler implementation or may take account of the positions of the random errors. However, there is no point in adding additional bit errors to bits which will be subsequently deleted.

As shown in Fig. 19.5, the number of additional errors is communicated to the recipient as part of the message vector in the preceding codeword with the information included with the message. This is carried out by *format into message vectors* shown in Fig. 19.5. In this case, usually 1 or 2 more message vectors in total will be required to convey the information regarding numbers of additional errors and the position vector (if this has not been already communicated to the recipient). Clearly, there are alternative arrangements to communicate the numbers of additional errors to the recipient such as using a previously agreed sequence of numbers or substituting a pseudorandom number generator for the truly random number generator (*generate random number of additional errors* shown in Fig. 19.5) with a known seed.

Using the previous example above, with the position vector:

{19, 3, 27, 17, 8, 30, 11, 15, 2, 5, 19, ..., 25}

The errored bits are in positions 7 and 23 (starting the position index from 0) and the deleted bits are in positions {19, 3, 27 and 17}. The encoded codeword prior to corruption is:

{01111000100001011000001001100110}

The number of additional bits in error is randomly chosen to be, say, 5. As the first 4 positions (index 0–3) in the position vector are to be deleted bits, starting from index 4, the bits in codeword positions {8, 30, 11, 15, and 2} are inverted in addition to the errored bits in positions 7 and 23. The 32-bit corrupted codeword produced is: {01011001000101001000001101100100}.

The bits in positions{19, 3, 27 and 17} are deleted to produce the 28 bit corrupted codeword:

{0101001000101001000110110100}

The additional bits in error are removed by the recipient of the cryptogram prior to errors and erasures correction, as shown in Fig. 19.6. The number of additional bits in error in each following codeword is retrieved from the descrambled message vectors by *format message*, shown in Fig. 19.6, and input to *number of additional errors*, which outputs this number to *generate additional errors* (the same as in Fig. 19.5). The position vector is stored in a buffer memory in *position vector*, which outputs it to *generate additional errors*. Each additional error is corrected by the adder *add*, shown in Fig. 19.6, which adds, modulo 2, a 1 output from *generate additional errors* in the position of each additional error. Retrieval of the message from this point follows correction of the errors and erasures, descrambling and formatting as described for Fig. 19.4.

Using the number of deleted bits and the position vector, 0's are inserted in the deleted bit positions to form the 32 bit corrupted codeword:

{01001001000101001000001101100100}

After the addition of the output from *generate additional errors* the bits in positions {8, 30, 11, 15, and 2} are inverted, thereby correcting the 5 additional errors to form the less corrupted codeword:

{01101001100001011000001101100110}

As in the first approach, this corrupted codeword is permuted, the syndromes calculated and the errors plus erasures corrected to retrieve the original message:

{0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1}

**Fig. 19.6** Private key decryption system with *s* random bit errors, 2(*t* − *s*) bit deletions and a random number of additional errors

**Fig. 19.7** Position vector generated by hash of message vector and nonlinear feedback shift register

In a further option, the position vector, instead of being a static vector, may be derived from a cryptographic hash of a previous message vector. Any standard cryptographic hash function may be used such as SHA-256 [11] or SHA-3 [12] as shown in Fig. 19.7. The message vector of length *k* bits is hashed using SHA-256 or SHA-3 to produce a binary hash vector of length 256 bits.

For example, the binary hash vector may be input to a nonlinear feedback shift register consisting of *shift register* having *p* stages, typically 64 stages with outputs determined by *select taps* enabling different scrambling keys to be used by selecting different outputs. The nonlinear feedback shift register arrangement to produce a position vector in *error positions buffer memory* is the same as that of Fig. 19.3 whose operation is described above.

As the hash vector is clocked into the nonlinear feedback shift register of Fig. 19.7, a derived position vector is stored in *error positions buffer memory* and used for encrypting the message vector as described above. The current message vector is encrypted using a position vector derived from the hash of the previous message vector. Since the recipient of the cryptogram has decrypted the previous message vector, the recipient can use the same hash function and nonlinear feedback shift register to derive the position vector needed to decrypt the current corrupted codeword. There are a number of arrangements that may be used for the first codeword. For example, a static position vector known only to the sender and recipient could be used, or alternatively a position vector derived from a fixed hash vector known only to the sender and recipient, or from the hash of a fixed message known only to the sender and recipient. A simpler arrangement may be used in which the shift register has no feedback, so that the position vector is derived directly from the hash vector. In this case the hash function needs to produce a hash vector of length at least *n*, the length of the codeword.
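The feedback-free variant can be sketched directly. The scheme below is illustrative only: it replaces the nonlinear feedback shift register with a chained SHA-256 byte stream, and the function name and the mapping from bytes to positions are assumptions, not the book's construction.

```python
import hashlib

def position_vector(prev_message: bytes, n: int, count: int):
    """Derive `count` distinct bit positions in range(n) from a hash
    of the previous message vector (illustrative scheme)."""
    block = hashlib.sha256(prev_message).digest()
    stream = b""
    positions = []
    i = 0
    while len(positions) < count:
        while 2 * i + 2 > len(stream):          # extend the stream on demand
            stream += block
            block = hashlib.sha256(block).digest()
        p = int.from_bytes(stream[2 * i:2 * i + 2], "big") % n
        if p not in positions:                  # keep positions distinct
            positions.append(p)
        i += 1
    return positions

# sender and recipient derive the same positions from the same message
pv = position_vector(b"previous message vector", n=1024, count=120)
```

Because the derivation is deterministic, both ends obtain the same position vector without ever transmitting it.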

As discussed earlier, the original McEliece system is vulnerable to chosen-plaintext attacks. If the same message is encrypted twice, the difference between the two cryptograms is just 2*t* bits or less, the sum of the two error patterns. This vulnerability is completely solved by encrypting or scrambling the plaintext prior to the McEliece system, using the error pattern as the key. To do this, the random error pattern needs to be generated first, before the codeword is constructed by encoding with the scrambled generator matrix.

This scrambler which is derived from the error vector, for each message vector, may be implemented in a number of ways. The message vector may be scrambled by multiplying by a *k* × *k* non-singular matrix derived from the error vector.

Alternatively, the message vector may be scrambled by treating the message vector as a polynomial *m*1(*x*) of degree *k* − 1 and multiplying it by a circulant polynomial *p*1(*x*) modulo 1 + *x<sup>k</sup>* which has an inverse [7]. The circulant polynomial *p*1(*x*) is derived from the error vector. Denoting the inverse of the circulant polynomial *p*1(*x*) as *q*1(*x*) then

$$p\_1(\mathbf{x})q\_1(\mathbf{x}) = 1 \text{ modulo } 1 + \mathbf{x}^k \tag{19.29}$$

Accordingly the scrambled message vector is *m*1(*x*)*p*1(*x*) which is encoded into a codeword using the scrambled generator matrix. Each message vector is scrambled in a different way as the error patterns are random and different from corrupted codeword to corrupted codeword. The corrupted codewords form the cryptogram.

On decoding of each codeword, the corresponding error vector is obtained with retrieval of the scrambled message vector. Considering the above example, the circulant polynomial *p*1(*x*) is derived from the error vector and the inverse *q*1(*x*) is calculated using Euclid's method [7] from *p*1(*x*). The original message vector is obtained by multiplying the retrieved scrambled message vector *m*1(*x*)*p*1(*x*) by *p*1(*x*) because

$$m\_1(\mathbf{x})p\_1(\mathbf{x})q\_1(\mathbf{x}) = m\_1(\mathbf{x}) \text{ modulo } 1 + \mathbf{x}^k \tag{19.30}$$

Another method of scrambling each message vector using a scrambler derived from the error vector is to use two nonlinear feedback shift registers as shown in Fig. 19.8. First, the error vector, represented as an *s*-bit sequence, is input to a modulo 2 adder *add* whose output is input to *shift register A* as shown in Fig. 19.8. The nonlinear feedback shift registers are the same as in Fig. 19.3 with operation as described above, but *select taps* will usually have a different setting and *nonlinear mapping* also will usually have a different mapping, although this is not essential. After clocking the *s*-bit error sequence into the nonlinear feedback shift register, *shift register A* shown in Fig. 19.8 will essentially contain a random binary vector. This vector is used by *define taps* to define which outputs of *shift register B* are to be input to *nonlinear mapping B*, whose outputs are added modulo 2 to the message vector input to form the input to *shift register B* shown in Fig. 19.8. The scrambling of the message vector is thus carried out by a nonlinear feedback shift register whose feedback connections are determined by a random binary vector derived from the error vector, the *s*-bit error sequence.
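The scrambler and descrambler pair can be modelled in a few lines. The sketch below is a simplified stand-in: the nonlinear map and the way the error sequence selects the taps are assumptions (the book's *nonlinear mapping B* and *define taps* are not specified in this form), but it preserves the key property that *shift register B* holds the scrambled bits in both directions, so descrambling exactly inverts scrambling.

```python
def nlf(reg, taps):
    # toy nonlinear map on three selected stages (an assumption)
    return (reg[taps[0]] & reg[taps[1]]) ^ reg[taps[2]]

def derive_taps(error_bits, reglen=8):
    # hypothetical 'define taps': the s-bit error sequence selects
    # three distinct stages of shift register B
    n = int("".join(map(str, error_bits)), 2)
    return [(n + i) % reglen for i in range(3)]

def scramble(msg_bits, taps, reglen=8):
    reg = [0] * reglen
    out = []
    for m in msg_bits:
        s = m ^ nlf(reg, taps)      # scrambled bit
        out.append(s)
        reg = [s] + reg[:-1]        # register B is fed the scrambled bit
    return out

def descramble(scr_bits, taps, reglen=8):
    reg = [0] * reglen
    out = []
    for s in scr_bits:
        out.append(s ^ nlf(reg, taps))
        reg = [s] + reg[:-1]        # same register contents as in Fig. 19.8
    return out

taps = derive_taps([1, 0, 1])
message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
scrambled = scramble(message, taps)
recovered = descramble(scrambled, taps)
```

Because both registers are fed the same scrambled bits, the keystream added at the descrambler is identical to the one added at the scrambler, so `recovered` equals `message`.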

The corresponding descrambler is shown in Fig. 19.9. Following decoding of each corrupted codeword, having corrected the random errors and bit erasures, the scrambled message vector is obtained and the error vector is in the form of the *s*-bit error sequence. As in the scrambler, the *s*-bit error sequence is input to a modulo 2 adder *add* whose output is input to *shift register A* as shown in Fig. 19.9. After clocking the *s*-bit error sequence into the nonlinear feedback shift register, *shift register A* shown in Fig. 19.9 will contain exactly the same binary vector as *shift register A* of Fig. 19.8. Consequently, exactly the same outputs of *shift register B* to

**Fig. 19.8** Message vector scrambling by nonlinear feedback shift register with taps defined by *s*-bit error pattern

**Fig. 19.9** Descrambling independently each scrambled message vector by nonlinear feedback shift register with taps defined by *s*-bit error pattern

be input to *nonlinear mapping B* will be defined by *define taps*. Moreover, comparing the input of *shift register B* of the scrambler (Fig. 19.8) to the input of *shift register B* of the descrambler (Fig. 19.9), it will be seen that the contents are identical and equal to the scrambled message vector.

Consequently, the same selected shift register outputs will be identical and with the same nonlinear mapping *nonlinear mapping B* the outputs of *nonlinear mapping B* in Fig. 19.9 will be identical to those that were the outputs of *nonlinear mapping B* in Fig. 19.8. The result of the addition of these outputs modulo 2 with the scrambled message vector is to produce the original message vector at the output of *add* in Fig. 19.9.

This is carried out for each scrambled message vector and associated error vector to recover the original message.

In some applications, a reduced size cryptogram is essential perhaps due to limited communications or storage capacity. For these applications, a simplification may be used in which the cryptogram consists of only one corrupted codeword containing random errors, the first codeword. The following codewords are corrupted by only deleting bits. The number of deleted bits is 2*t* bits per codeword using a position vector as described above.

For example, with *n* = 1024, and the Goppa code correcting *t* = 60 bit errors, there are 2*t* bits deleted per codeword so that apart from the first corrupted codeword, each corrupted codeword is only 904 bits long and conveys 624 message vector bits per corrupted codeword.

A similar approach is to hash the error vector of the first corrupted codeword and use this hash value as the key of a symmetric encryption system such as the Advanced Encryption Standard (AES) [10] and encrypt any following information this way. Effectively, this is AES encryption operating with a random session key since the error pattern is chosen randomly as in the classic, hybrid encryption system.
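A sketch of this key derivation step, assuming SHA-256 as the hash and an illustrative error vector (the AES encryption itself is omitted):

```python
import hashlib

# illustrative s-bit error vector of the first corrupted codeword,
# packed into bytes (the actual vector is random per cryptogram)
error_vector = bytes([0x04, 0x00, 0x10, 0x00, 0x00, 0x02, 0x00, 0x80])

# hash the error vector to obtain a 256-bit session key for a
# symmetric cipher such as AES
session_key = hashlib.sha256(error_vector).digest()
```

Since only the sender and the legitimate recipient ever learn the error vector, both can derive the same symmetric key without transmitting it.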

#### **19.3 Reducing the Public Key Size**

In the original McEliece system, the public key is the *k* × *n* generator matrix, which can be quite large. For example, with *n* = 2048 and *k* = 1148, the generator matrix needs to be represented by 1148 × 2048 = 2.35 × 10<sup>6</sup> bits. Representing the generator matrix in reduced echelon form reduces it to *k* × (*n* − *k*) = 1148 × 900 ≈ 1.03 × 10<sup>6</sup> bits. In the *n* = 32 example above, the generator matrix is the 12 × 32 matrix, **PSG**(**32**, **12**, **9**), given by Eq. (19.26). Rows of this matrix may be added together using modulo 2 arithmetic so as to produce a matrix with *k* independent columns. This matrix is a reduced echelon matrix, possibly permuted to obtain *k* independent columns, and may be straightforwardly derived using the Gauss–Jordan elimination procedure. With permutations, there are a large number of possible solutions; candidate column positions may be selected either initially in consecutive order to determine a solution, or optionally in random order to arrive at other solutions.
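A minimal sketch of Gauss–Jordan elimination over GF(2), with each row packed into an integer (bit *i* of the integer is column *i*); the function and variable names are illustrative:

```python
def reduced_echelon_gf2(rows, n):
    """Bring a list of GF(2) rows into reduced echelon form,
    choosing pivot columns left to right; returns the new rows
    and the list of pivot (independent) columns."""
    rows = rows[:]
    pivots = []
    r = 0
    for col in range(n):
        for i in range(r, len(rows)):
            if rows[i] >> col & 1:              # row with a 1 in this column
                rows[r], rows[i] = rows[i], rows[r]
                for j in range(len(rows)):      # clear the column elsewhere
                    if j != r and rows[j] >> col & 1:
                        rows[j] ^= rows[r]      # modulo 2 row addition
                pivots.append(col)
                r += 1
                break
        if r == len(rows):
            break
    return rows, pivots

# tiny example: the third row is the modulo 2 sum of the first two,
# so only two independent (pivot) columns are found
rows, pivots = reduced_echelon_gf2([0b101, 0b110, 0b011], 3)
```

Selecting candidate pivot columns in a random order instead of left to right yields the other permuted solutions mentioned above.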

Consider as an example the first option of selecting candidate column positions in consecutive order. For the **PSG**(**32**, **12**, **9**) matrix (19.26), the following permuted reduced echelon generator matrix is produced:


The permutation defined by the following input and output bit position sequences is used to rearrange the columns of the permuted, reduced echelon generator matrix.

```
0 1 2 3 4 5 6 7 8 9 12 14 10 11 13 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 (19.32)
```
This permutation produces a classical reduced echelon generator matrix [7], denoted as **Q**(**32**, **<sup>12</sup>**, **<sup>9</sup>**):

$$\mathbf{Q}(32, 12, 9) = \left[\begin{array}{c}
10000000000000101101010100011111\\
01000000000010001001000100101110\\
00100000000000110010111100110000\\
00010000000011101010000101000010\\
00001000000001101111010010001001\\
00000100000001000100110111111101\\
00000010000011000010110010011001\\
00000001000011101111110001100111\\
00000000100000000110100110110011\\
00000000010010011100101110010100\\
00000000001000011010101111000111\\
00000000000100010101011111010110
\end{array}\right] \tag{19.33}$$

Codewords generated by this matrix are from a systematic code [7] with the first 12 bits being information bits and the last 20 bits being parity bits. Correspondingly, the matrix above, **Q**(**32**, **12**, **9**), consists of an identity matrix followed by a matrix denoted as **QT**(**32**, **12**, **9**) which defines the parity bits part of the generator matrix. The transpose of this matrix is the parity check matrix of the code [7]. As shown in Fig. 19.10, the public key consists of the parity check matrix, less the identity submatrix, and a sequence of *n* numbers representing a permutation of the codeword bits after encoding. By permuting the codewords with the inverse permutation, the resulting permuted codewords will be identical to codewords produced by **PSG**(**32**, **12**, **9**), the public key of the original McEliece public key system [8]. However, whilst the codewords are identical, the information bits will not correspond.

The permutation is defined by the following input and output bit position sequences.

0 1 2 3 4 5 6 7 8 9 12 14 10 11 13 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 (19.34)

**Fig. 19.10** Reduced size public key encryption system

As the output bit position sequence is just a sequence of bits in natural order, the permutation may be defined only by the input bit position sequence.

In this case, the public key consists of an *n* position permutation sequence and in this example the sequence chosen is:

0 1 2 3 4 5 6 7 8 9 12 14 10 11 13 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 (19.35)

and the *k* × (*n* − *k*) matrix, **QT**(**32**, **12**, **9**), which in this example is the 12 × 20 matrix:

$$\mathbf{QT}(32, 12, 9) = \left[\begin{array}{c}
00101101010100011111\\
10001001000100101110\\
00110010111100110000\\
11101010000101000010\\
01101111010010001001\\
01000100110111111101\\
11000010110010011001\\
11101111110001100111\\
00000110100110110011\\
10011100101110010100\\
00011010101111000111\\
00010101011111010110
\end{array}\right] \tag{19.36}$$

The public key of this system is much smaller than the public key of the original McEliece public key system, since as discussed below, there is no need to include the permutation sequence in the public key.

The message is split into message vectors of length 12 bits, adding padding bits as necessary so that there is an integral number of message vectors. Each message vector, after scrambling, is encoded as a systematic codeword using **QT**(**32**, **12**, **9**), part of the public key. Each systematic codeword that is obtained is permuted using the permutation (19.35), the other part of the public key. The resulting codewords are identical to codewords generated using the generator matrix **PSG**(**32**, **12**, **9**) (19.26), the corresponding public key of the original McEliece public key system, but generated by different messages.

It should be noted that it is not necessary to use the exact permutation sequence that produces codewords identical to that produced by the original McEliece public key system for the same Goppa code and input parameters. As every permutation sequence has an inverse permutation sequence, any arbitrary permutation sequence, randomly generated or otherwise, may be used for the permutation sequence part of the public key. The permutation sequence that is the inverse of this arbitrary permutation sequence is absorbed into the permutation sequence used in decryption and forms part of the private key. The security of the system is enhanced by allowing arbitrary permutation sequences to be used and permutation sequences do not need to be part of the public key.

The purpose of scrambling each message vector using the fixed scrambler shown in Fig. 19.10 is to provide a one-to-one mapping between the 2*<sup>k</sup>* possible message vectors and the 2*<sup>k</sup>* scrambled message vectors such that the reverse mapping, which is provided by the descrambler, used in decryption, produces error multiplication if there are any errors present. For many messages, some information can be gained even if the message contains errors. The scrambler and corresponding descrambler prevents information being gained this way from the cryptogram itself or by means of some error guessing strategy for decryption by an attacker. The descrambler is designed to have the property that it produces descrambled message vectors that are likely to have a large Hamming distance between vectors for input scrambled message vectors which differ in a small number of bit positions.

There are a number of different techniques of realising such a scrambler and descrambler. One method is to use symmetric key encryption such as the Advanced Encryption Standard (AES) [10] with a fixed key.

An alternative means is provided by the scrambler arrangement shown in Fig. 19.11. The same arrangement may be used for descrambling but with different shift register taps and is shown in Fig. 19.12. Denoting each *k* bit message vector as a polynomial *m*(*x*) of degree *k* − 1:

$$m(\mathbf{x}) = m\_0 + m\_1 \mathbf{x} + m\_2 \mathbf{x}^2 + m\_3 \mathbf{x}^3 \cdots + m\_{k-1} \mathbf{x}^{k-1} \tag{19.37}$$

and denoting the tap positions determined by *define taps* of Fig. 19.11 by μ(*x*) where

$$
\mu(\mathbf{x}) = \mu\_0 + \mu\_1 \mathbf{x} + \mu\_2 \mathbf{x}^2 + \mu\_3 \mathbf{x}^3 \cdots + \mu\_{k-1} \mathbf{x}^{k-1} \tag{19.38}
$$

where the coefficients μ<sup>0</sup> through to μ*<sup>k</sup>*−<sup>1</sup> have binary values of 1 or 0.

**Fig. 19.11** Scrambler arrangement

**Fig. 19.12** Descrambler arrangement

The output of the scrambler, denoted by the polynomial, *scram*(*x*), is the scrambled message vector given by the polynomial multiplication

$$
scram(\mathbf{x}) = m(\mathbf{x}).\mu(\mathbf{x})\text{ modulo } (1+\mathbf{x}^k)\tag{19.39}
$$

The scrambled message vector is produced by the arrangement shown in Fig. 19.11 after *shift register A with k stages* and *shift register B with k stages* have been clocked 2*k* times, and is present at the input of *shift register B with k stages*, whose last stage output is connected to the input of the adder, *adder*. The input of *shift register B with k stages* corresponds to the scrambled message vector for the next additional *k* clock cycles, with these bits defining the binary coefficients of *scram*(*x*). The descrambler arrangement is shown in Fig. 19.12 and is an identical circuit to that of the scrambler but with different tap settings. The descrambler is used in decryption.

For *k* = 12 an example of a good scrambler polynomial, μ(*x*) is

$$\mu(\mathbf{x}) = 1 + \mathbf{x} + \mathbf{x}^4 + \mathbf{x}^5 + \mathbf{x}^8 + \mathbf{x}^9 + \mathbf{x}^{11} \tag{19.40}$$

For brevity, the binary coefficients may be represented as a binary vector. In this example, μ(*x*) is represented as {110011001101}. This is a good scrambler polynomial because it has a relatively large number of taps (seven taps) and its inverse, the descrambler polynomial also has a relatively large number of taps (seven taps). The corresponding descrambler polynomial, θ (*x*) is

$$\theta(\mathbf{x}) = 1 + \mathbf{x} + \mathbf{x}^3 + \mathbf{x}^4 + \mathbf{x}^7 + \mathbf{x}^8 + \mathbf{x}^{11} \tag{19.41}$$

which may be represented by the binary vector {110110011001}. It is straightforward to verify that

$$\begin{array}{rcl} \mu(\mathbf{x}) \times \boldsymbol{\theta}(\mathbf{x}) &=& 1 + \mathbf{x}^2 + \mathbf{x}^3 + \mathbf{x}^4 + \mathbf{x}^5 + \mathbf{x}^6 + \mathbf{x}^8 + \mathbf{x}^{10} \\ &+ \mathbf{x}^{14} + \mathbf{x}^{15} + \mathbf{x}^{16} + \mathbf{x}^{17} + \mathbf{x}^{18} + \mathbf{x}^{20} + \mathbf{x}^{22} \\ &=& 1 \text{ modulo } (1 + \mathbf{x}^k) \end{array} \tag{19.42}$$

and so

$$
scram(\mathbf{x}) \times \theta(\mathbf{x}) = m(\mathbf{x}) \text{ modulo } (1 + \mathbf{x}^k) \tag{19.43}
$$
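The inverse pair is easy to check numerically. A minimal sketch of GF(2) polynomial multiplication modulo 1 + *x*<sup>k</sup> (exponents wrap cyclically, since *x*<sup>k</sup> = 1 in characteristic 2):

```python
def polymul_mod(a, b, k):
    """Multiply GF(2) polynomials given as coefficient lists
    (index = degree) and reduce modulo 1 + x^k."""
    out = [0] * k
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[(i + j) % k] ^= bj      # coefficients add modulo 2
    return out

mu    = [1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1]   # {110011001101}
theta = [1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1]   # {110110011001}
product = polymul_mod(mu, theta, 12)            # reduces to 1
```

The product reduces to the constant polynomial 1, confirming that θ(*x*) is the inverse of μ(*x*) modulo 1 + *x*<sup>12</sup>.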

As a simple example of a message, consider that the message consists of a single message vector with the information bit pattern {0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1} and so:

$$m(\mathbf{x}) = \mathbf{x} + \mathbf{x}^{\mathbf{3}} + \mathbf{x}^{\mathbf{11}} \tag{19.44}$$

This is input to the scrambling arrangement shown in Fig. 19.11. The scrambled message output is *scram*(*x*) = *m*(*x*) × μ(*x*) given by

$$\begin{array}{rl} \text{scram}(\mathbf{x}) = & (1 + \mathbf{x} + \mathbf{x}^4 + \mathbf{x}^5 + \mathbf{x}^8 + \mathbf{x}^9 + \mathbf{x}^{11}).(\mathbf{x} + \mathbf{x}^3 + \mathbf{x}^{11}) \\ = & \mathbf{x} + \mathbf{x}^2 + \mathbf{x}^5 + \mathbf{x}^6 + \mathbf{x}^9 + \mathbf{x}^{10} + \mathbf{x}^{12} \\ & + \mathbf{x}^3 + \mathbf{x}^4 + \mathbf{x}^7 + \mathbf{x}^8 + \mathbf{x}^{11} + \mathbf{x}^{12} + \mathbf{x}^{14} \\ & + \mathbf{x}^{11} + \mathbf{x}^{12} + \mathbf{x}^{15} + \mathbf{x}^{16} + \mathbf{x}^{19} + \mathbf{x}^{20} + \mathbf{x}^{22} \quad \text{modulo } (1 + \mathbf{x}^{12}) \\ = & 1 + \mathbf{x} + \mathbf{x}^5 + \mathbf{x}^6 + \mathbf{x}^9 \end{array} \tag{19.45}$$

and the scrambling arrangement shown in Fig. 19.11 produces the scrambled message bit pattern {1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0}.

Referring to Fig. 19.10, the next stage is to use the parity check matrix part of the *public key* to calculate the parity bits from the information bits. Starting with an all 0's vector, for each position where the information bit pattern is a 1, the corresponding row of **QT**(**32**, **12**, **9**) (19.36) is added modulo 2 to the running result to produce the parity bits, which together with the information bits will form the digital cryptogram after permuting the order of the bits and adding random errors. In this example, the codeword is generated by adding, modulo 2, rows 1, 2, 6, 7 and 10 of **QT**(**32**, **12**, **9**) to produce:

$$\begin{array}{r}
00101101010100011111\\
\oplus\ 10001001000100101110\\
\oplus\ 01000100110111111101\\
\oplus\ 11000010110010011001\\
\oplus\ 10011100101110010100\\
\hline
10111110111011000001
\end{array}\tag{19.46}$$

The resulting systematic codeword is:

{11000110010010111110111011000001}
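This parity computation is a row-wise modulo 2 accumulation and can be reproduced directly from the rows of **QT**(**32**, **12**, **9**) given in (19.36):

```python
QT = [
    "00101101010100011111", "10001001000100101110",
    "00110010111100110000", "11101010000101000010",
    "01101111010010001001", "01000100110111111101",
    "11000010110010011001", "11101111110001100111",
    "00000110100110110011", "10011100101110010100",
    "00011010101111000111", "00010101011111010110",
]

info = "110001100100"            # scrambled message vector
parity = [0] * 20
for bit, row in zip(info, QT):
    if bit == "1":               # add the matching row modulo 2
        for i, c in enumerate(row):
            parity[i] ^= int(c)

codeword = info + "".join(map(str, parity))
```

The result is the systematic codeword {11000110010010111110111011000001} shown above.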

The last step in constructing the final codeword which will be used to construct the cryptogram is to apply an arbitrary preset permutation sequence. Referring to Fig. 19.10, the operation *assemble n bit codewords from n-k parity bits and k message bits* simply takes each codeword encoded as a systematic codeword and applies the preset permutation sequence.

In this example, the permutation sequence that is used is not chosen arbitrarily but is the permutation sequence that will produce the same codewords as the original McEliece public key system for the same Goppa code and input parameters. The permutation sequence is:

0 1 2 3 4 5 6 7 8 9 12 14 10 11 13 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 (19.47)

The notation is that the 10th bit should move to the 12th position, the 11th bit should move to the 14th position, the 12th bit should move to the 10th position, the 13th bit should move to the 11th position, the 14th bit should move to the 13th position and all other bits remain in their same positions.

Accordingly, the permuted codeword becomes:

{11000110011001011110111011000001}
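Applying the permutation sequence sends input bit *j* to the output position given by the *j*th entry of the sequence; a short sketch reproducing the permuted codeword:

```python
# permutation sequence (19.47): bit j moves to position seq[j]
seq = list(range(10)) + [12, 14, 10, 11, 13, 15] + list(range(16, 32))
systematic = "11000110010010111110111011000001"

out = [None] * 32
for j, b in enumerate(systematic):
    out[seq[j]] = b              # bit j moves to position seq[j]

permuted = "".join(out)
```

This yields the permuted codeword {11000110011001011110111011000001} above.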

and this will be the input to the adder, *add* of Fig. 19.1.

The Goppa code used in this example can correct up to 4 errors, (*t* = 4), and a random number is chosen for the number of bits to be in error, (*s*) with *s* ≤ 4.

A truly random source such as a thermal noise source as described above produces the most secure results, but a pseudorandom generator can be used instead, particularly if seeded from the time of day with fine time resolution such as 1 ms. If the number of random errors chosen is too few, the security of the digital cryptogram will be compromised. Correspondingly, the minimum number of errors chosen is a design parameter depending upon the length of the Goppa code and *t*, the number of correctable errors. A suitable choice for the minimum number of errors in practice lies between *t*/2 and *t*. If the cryptogram is likely to be subject to additional errors due to transmission over a noisy or interference-prone medium such as wireless, or stored and read using an imperfect reader such as in barcode applications, then these additional errors can be corrected as well as the deliberately introduced errors, provided the total number of errors is no more than *t*.

For such applications, the number of deliberate errors is typically constrained to lie between *t*/3 and 2*t*/3.

For the example above, consider that the number of bit errors is 3 and these are randomly chosen to be in positions 4, 11 and 27 (starting the position index from 0). The bits in these positions in the codeword are inverted to produce the result

{11001110011101011110111011010001}

The cryptogram is this corrupted codeword, which is transmitted or stored depending upon the application.
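The error insertion step simply inverts the chosen bit positions; a sketch using the example's positions 4, 11 and 27 (in practice these positions are chosen randomly):

```python
codeword = "11000110011001011110111011000001"   # permuted codeword
bits = [int(b) for b in codeword]
for p in (4, 11, 27):            # randomly chosen error positions
    bits[p] ^= 1                 # invert the bit
cryptogram = "".join(map(str, bits))
```

The result is the corrupted codeword {11001110011101011110111011010001}, the cryptogram.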

The intended recipient of this cryptogram retrieves the message in a series of steps. Figure 19.13 shows the system used for decryption. The retrieved cryptogram is formatted into corrupted codewords by *format into corrupted codewords* shown in Fig. 19.13. For the example above, the recipient first receives or otherwise retrieves the cryptogram, which may contain additional errors:

{11001110011101011110111011010001}.

It is assumed in this example that no additional errors have occurred although with this particular example one additional error can be accommodated.

The private key contains the information of which Goppa code was used and a first permutation sequence which when applied to the retrieved, corrupted codewords which make up the cryptogram produces corrupted codewords of the Goppa code with the bits in the correct order. Usually, the private key also contains a second

**Fig. 19.13** Private key decryption system

permutation sequence which when applied to the error-corrected Goppa codewords puts the scrambled information bits in natural order. Sometimes the private key also contains a third permutation sequence which when applied to the error vectors found in decoding the corrupted Goppa codewords puts the bit errors in the same order that they were when inserted during encryption. All of this information is stored in *private key* in Fig. 19.13. Other information necessary to decrypt the cryptogram, such as the descrambler required, may also be stored in the private key or be implicit.

There are two permutation sequences stored as part of the private key and the decryption arrangement is shown in Fig. 19.13. The corrupted codewords retrieved from the received or read cryptogram are permuted with a first permutation sequence which will put the bits in each corrupted codeword in the same order as the Goppa codewords. In this example, the first permutation sequence stored as part of the private key is:

24 11 3 23 2 17 20 8 13 30 31 14 15 22 7 1 9 6 21 4 10 5 28 26 19 16 25 0 27 12 18 29 (19.48)

This defines the following permutation input and output sequences:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 27 15 4 2 19 21 17 14 7 16 20 1 29 8 11 12 25 5 30 24 6 18 13 3 0 26 23 28 22 31 9 10 (19.49)

so that for example bit 23 becomes bit 3 after permutation and bit 30 becomes bit 9 after permutation. The resulting, permuted corrupted codeword is:

{11000110101011011111110001111010}

The permutation is carried out by *permute corrupted codeword bits* shown in Fig. 19.13 with the first permutation sequence input from *private key*.
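The first permutation can be applied in the same way as in encryption, sending received bit *j* to the output position given by the pairs (19.49); a sketch using the example cryptogram:

```python
# output positions from the permutation pairs (19.49):
# received bit j moves to position dest[j]
dest = [27, 15, 4, 2, 19, 21, 17, 14, 7, 16, 20, 1, 29, 8, 11, 12,
        25, 5, 30, 24, 6, 18, 13, 3, 0, 26, 23, 28, 22, 31, 9, 10]

received = "11001110011101011110111011010001"
out = [None] * 32
for j, b in enumerate(received):
    out[dest[j]] = b

permuted_codeword = "".join(out)
```

The result is the permuted corrupted codeword {11000110101011011111110001111010} quoted above, now in Goppa code bit order.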

Following the permutation of each corrupted codeword, the codeword bits are in the correct order to satisfy the parity check matrix, matrix (19.19), if there were no codeword bit errors. (In this case all of the syndrome values would be equal to 0.) The next step is to treat each bit in each permuted corrupted codeword as a *GF*(2<sup>5</sup>) symbol, with a 1 equal to α<sup>0</sup> and a 0 equal to 0, and use the parity check matrix, matrix (19.19), stored as part of *private key*, to calculate the syndrome values for each row of the parity check matrix. The syndrome values produced in this example are, respectively, α<sup>30</sup>, α<sup>27</sup>, α<sup>4</sup>, and α<sup>2</sup>. In Fig. 19.13, *error-correction decoder* calculates the syndromes as a first step in correcting the bit errors.

The bit errors are corrected using the syndrome values to produce an error free codeword from the Goppa code for each permuted corrupted codeword. There are many published algorithms for correcting bit errors for Goppa codes, but the most straightforward is to use a BCH decoder as described by Retter [13] because Berlekamp–Massey may then be used to solve the key equation. After decoding, the error free permuted codeword is obtained:

{10000110101011011110110001110010}

and the error pattern, defined as a 1 in each error position is

{01000000000000000001000000001000}.

As shown in Fig. 19.13 *permute codeword bits* takes the output of *error-correction decoder* and applies the second permutation sequence stored as part of the private key to each corrected codeword.

Working through the example, consider that the following permutation input and output sequences is applied to the error free permuted codeword (the decoded codeword of the Goppa code).

27 15 4 2 19 21 17 14 7 16 20 1 29 8 11 12 25 5 30 24 6 18 13 3 0 26 23 28 22 31 9 10 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 (19.50)

The result is that the scrambled message bits correspond to bit positions:

$$\{0\ 1\ 2\ 3\ 4\ 5\ 6\ 7\ 8\ 9\ 12\ 14\}$$

from the encryption procedure described above. The scrambled message bits may be repositioned in bit positions:

{0 1 2 3 4 5 6 7 8 9 10 11}

by absorbing the required additional permutations into a permutation sequence defined by the following permutation input and output sequences:

27 15 4 2 19 21 17 14 7 16 29 11 20 8 1 12 25 5 30 24 6 18 13 3 0 26 23 28 22 31 9 10 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 (19.51)

The second permutation sequence which corresponds to this net permutation and which is stored as part of the private key, *private key* shown in Fig. 19.13 is:

```
24 14 3 23 2 17 20 8 13 30 31 11 15 22 7 1 9 6 21 4 12 5 28 26 19 16 25 0 27 10 18 29 (19.52)
```
The second permutation sequence is applied by *permute codeword bits*. Since the encryption and decryption permutation sequences are all derived at the same time in forming the public key and private key from the chosen Goppa code, it is straightforward to calculate and store the net relevant permutation sequences as part of the private key.

Continuing working through the example, applying the second permutation sequence to the error free permuted codeword produces the output of *permute codeword bits*. The first 12 bits of the result will be the binary vector {110001100100}, and it can be seen that this is identical to the scrambled message vector produced by the encryption operation. Represented as a polynomial, the binary vector is 1 + *x* + *x*<sup>5</sup> + *x*<sup>6</sup> + *x*<sup>9</sup>.

As shown in Fig. 19.13, the next step is for the *k* information bits of each permuted error free codeword to be descrambled by *descramble information bits*. In this example, *descramble information bits* is carried out by the descrambler arrangement shown in Fig. 19.12 with *define taps* corresponding to the polynomial 1 + *x* + *x*<sup>3</sup> + *x*<sup>4</sup> + *x*<sup>7</sup> + *x*<sup>8</sup> + *x*<sup>11</sup>.

The output of the descrambler in polynomial form is (1 + *x* + *x*<sup>5</sup> + *x*<sup>6</sup> + *x*<sup>9</sup>).(1 + *x* + *x*<sup>3</sup> + *x*<sup>4</sup> + *x*<sup>7</sup> + *x*<sup>8</sup> + *x*<sup>11</sup>) modulo 1 + *x*<sup>12</sup>. After polynomial multiplication, the result is (*x* + *x*<sup>3</sup> + *x*<sup>11</sup>) corresponding to the message

{010100000001}

It is apparent that this is the same as the original plaintext message prior to encryption.
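The descrambling step can likewise be checked with GF(2) cyclic polynomial multiplication: multiplying the recovered scrambled vector by the descrambler polynomial θ(*x*) modulo 1 + *x*<sup>12</sup> returns the original message.

```python
def polymul_mod(a, b, k):
    # GF(2) polynomial product reduced modulo 1 + x^k
    out = [0] * k
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[(i + j) % k] ^= bj
    return out

scrambled = [1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0]   # 1 + x + x^5 + x^6 + x^9
theta     = [1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1]   # descrambler polynomial
message = polymul_mod(scrambled, theta, 12)
```

The output coefficient list corresponds to the bit pattern {0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1}, the original plaintext message vector.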

With each cryptogram restricted to contain *s* errors, the cryptosystem, as well as providing security, is able automatically to correct up to *t* − *s* errors occurring in the communication of the cryptogram, as shown in Fig. 19.14. It makes no difference to the decryption arrangement of Fig. 19.13 whether the bit errors were introduced deliberately during encryption or were introduced due to errors in transmitting the cryptogram. A correct message is output after decryption provided the total number of bit errors is less than or equal to *t*, the error-correcting capability of the Goppa code used to construct the public and private keys.

As an illustration, a (512, 287, 51) Goppa code of length 512 bits with message vectors of length 287 bits can correct up to 25 bit errors, (*t* = 25). With *s* = 15, 15 bit errors are added to each codeword during encryption. Up to 10 additional bit errors can occur in transmission of each corrupted codeword and the message will be still recovered correctly from the received cryptogram.

**Fig. 19.14** Public key encryption system correcting communication transmission errors

The system will also correct errors in the reading of cryptograms stored in data media. As an example, a medium to long range ISO 18000-6B RFID system operating in the 860–930 MHz band allows 2048 bits of user data to be read back from a tag. A (2048, 1388, 121) Goppa code of length 2048 bits with message vectors of length 1388 bits can correct 60 errors (*t* = 60). With *s* = 25, 25 bit errors are added to the codeword during encryption and this is written to each passive tag as a cryptogram, stored in non-volatile memory. As well as providing confidentiality of the tag contents, up to 35 additional bit errors can be tolerated in reading each passive tag, thereby extending the operational range. The plaintext message, the encrypted tag payload of 1388 bits, will be recovered more reliably with each scanning of the tag.

# **19.4 Reducing the Cryptogram Length Without Loss of Security**

In many applications a key encapsulation system is used. This is a hybrid encryption system in which a public key cryptosystem is used to send a random session key to the recipient and a symmetric key encryption system, such as AES [10], is used to encrypt the data that follows. Typically a session key is 256 bits long. To provide the same 256 bit security level, a code length of 8192 bits needs to be used, with a code rate of 0.86. Security analysis of the McEliece system is provided in Sect. 19.5. The Goppa code is the (8192, 7048, 177) code. There are 7048 information bits available in each codeword, but only 256 bits are needed to communicate the session key. The code could be shortened in the traditional manner by truncating the generator matrix, but this would leave less room to insert the *t* errors, thereby reducing the security. The obvious question is: can the codeword be shortened without reducing the security?

Niederreiter [9] solved this problem by transmitting only the *n* − *k* bits of the syndrome calculated from the error pattern. Niederreiter originally proposed in his paper a system using Generalised Reed–Solomon codes, but this scheme was subsequently broken by an attack by Sidelnikov and Shestakov [15]. However, their attack fails if binary Goppa codes are used instead of Generalised Reed–Solomon codes, and the Niederreiter system is now associated with the transmission of the *n* − *k* parity bits as syndromes of the McEliece system.

It is unfortunate that only a tiny fraction of the 2<sup>*n*−*k*</sup> syndromes correspond to correctable error patterns. For the (8192, 7048, 177) Goppa code, it turns out that there are 2<sup>697</sup> correctable syndromes out of the total of 2<sup>1144</sup> syndromes. The probability of an arbitrary syndrome being decodable is 2<sup>−447</sup>, around 3 × 10<sup>−135</sup>. This is the limitation of the Niederreiter system. The plaintext message has to be mapped into an error pattern consisting of *t* bits uniformly distributed over *n* bits. Any deterministic method of doing this will be vulnerable to a chosen-plaintext attack. Of course, the Niederreiter system can be used to send random messages, first generated as a random error pattern, as in a random session key. However, additional information really needs to be sent as well, such as a MAC, timestamp, sender ID, digital signature or other supplementary information.
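The count of correctable syndromes can be reproduced numerically; a quick sketch (not from the text) using the log-gamma function:

```python
from math import lgamma, log

def log2_binomial(n, t):
    # log2 of the binomial coefficient C(n, t), via the log-gamma function.
    return (lgamma(n + 1) - lgamma(t + 1) - lgamma(n - t + 1)) / log(2)

n, k, t = 8192, 7048, 88           # the (8192, 7048, 177) code corrects t = 88 errors
correctable = log2_binomial(n, t)  # log2 of the number of correctable syndromes, ~697
total = n - k                      # 1144 syndrome bits in all
print(round(correctable), total)
```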

**Fig. 19.15** Public key encryption system with shortened codewords

One solution is to use the system shown in Fig. 19.15. The plaintext message, *M*, consisting of a 256 bit random session key concatenated with 512 bits of supplementary information such as a MAC, time stamp and sender ID is encrypted in the encryption module with a key that is a cryptographic hash, such as SHA-3 [12], of the error pattern. The encrypted message is *Me*. The error pattern consists of *t* bit errors randomly distributed over *n* bits. This bit pattern is partitioned into three parts, A, B and C as shown in Fig. 19.16. Using the (8192, 7048, 177) Goppa code, part A covers the first 6280 bits, part B covers the next 768 bits and part C covers the last 1144 bits, the parity bits.

The public key encoder consists of the public key generator matrix in reduced echelon form which is used to encode information bits consisting of part A, concatenated with *Me* as shown in Fig. 19.15. After encoding, the *n* − *k* parity bits of the codeword are *P*(*A*) + *P*(*Me*). The codeword is then shortened by removing the first 6280 bits of the codeword. The error pattern parts B and C are added to the shortened codeword of length 1912 bits to form the ciphertext of length 1912 bits. The format of the various parts of the error pattern, the hash derivation, encryption and codeword are shown in Fig. 19.16.

The principle that this system uses is that the syndrome of any codeword is zero and that the cryptosystem is linear. The codeword resulting from the encoding of A is {*A* 0 ... 0 *P*(*A*)}, where *P*(*A*) are the parity bits. The sum of the syndromes of the sections of this codeword must be zero. Hence:

$$\text{Syndrome}(A) + \text{Syndrome}(0 \dots 0) + P(A) = 0$$


**Fig. 19.16** Format of the shortened codeword and error pattern

As the base field is 2,

$$\text{Syndrome}(A) = P(A)$$

Consequently, including *P*(*A*) in the ciphertext instead of the error bits of part A results in the same syndrome being calculated in the decoder, namely *P*(*A*) + *P*(*B*) + *P*(*C*). Removing error pattern part A shortens the ciphertext, whilst including *P*(*A*) instead requires no additional bits in the ciphertext. Since it is necessary to derive the complete error pattern of length *n* bits in order to decrypt the ciphertext, there is no loss of security from shortening the ciphertext.

**Fig. 19.17** Decryption method for the shortened ciphertext
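The identity Syndrome(A) = P(A) can be illustrated with a toy systematic code; the sketch below uses a hypothetical (7, 4) code with generator matrix [I | P] and parity check matrix [Pᵀ | I]:

```python
# Toy systematic (7, 4) code with G = [I | P] and H = [P^T | I] over GF(2),
# illustrating the identity used above: the syndrome of the vector
# {A 0 ... 0} equals P(A), the parity bits of the codeword encoding A.
P = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1],
     [1, 1, 1]]

def parity_bits(A):
    # P(A): the parity section of the systematic codeword {A P(A)}.
    return [sum(A[i] * P[i][j] for i in range(4)) % 2 for j in range(3)]

def syndrome(v):
    # Syndrome under H = [P^T | I]: P^T applied to the first 4 bits,
    # plus the last 3 bits, over GF(2).
    return [(sum(P[i][j] * v[i] for i in range(4)) + v[4 + j]) % 2
            for j in range(3)]

A = [1, 0, 1, 1]
codeword = A + parity_bits(A)
assert syndrome(codeword) == [0, 0, 0]  # every codeword has zero syndrome

v = A + [0, 0, 0]                       # the vector {A 0 0 0}
print(syndrome(v) == parity_bits(A))    # True: Syndrome(A) = P(A)
```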

The method used to decrypt the ciphertext by using the private key is shown in Fig. 19.17. The received ciphertext is padded with leading zeros to restore its length to 8192 bits. This is then permuted to be in the same order as the Goppa code and the parity check matrix of the Goppa code is used to calculate the syndrome. A Goppa code error-correcting decoder is then used to find the permuted error pattern A, B and C from this syndrome. The most straightforward error-correcting decoder to use is based on Retter's decoding method [13]. This involves calculating a syndrome having 2(*n* − *k*) parity bits from the parity check matrix of *g*<sup>2</sup>(*z*), where *g*(*z*) is the Goppa polynomial of the code, then using the Berlekamp–Massey method to solve the key equation as in a standard BCH decoder to find the error bit positions. It is because the codewords are binary and the base field is 2 that the Goppa code codewords satisfy the parity checks of *g*<sup>2</sup>(*z*) as well as the parity checks of *g*(*z*), since 1<sup>2</sup> = 1.

As shown in Fig. 19.17, once the error pattern is determined, it is inverse permuted to produce A, B and C which is hashed to produce the decryption key needed to decrypt *Me* back into the plaintext message *M*. Part B of the derived error pattern is added to the *Me* + *B* contained in the received ciphertext to produce *Me* as shown in Fig. 19.17.

#### **19.5 Security of the Cryptosystem**

If we consider the parameters that Professor McEliece originally chose, a code of length 1024 bits correcting 50 errors and 524 information bits, then a brute force attack may be based on guessing the error pattern, adding this to the cryptogram and checking if the result is a valid codeword. Checking if the result is a codeword is easy. By using elementary matrix operations on the public key, the generator matrix, we can turn it into a reduced echelon matrix whose transpose is the parity check matrix. We simply use this parity check matrix to calculate the syndrome of the *n* bit input vector. If the syndrome is equal to zero then the input vector is a codeword.

The maximum number of syndromes that need to be calculated is equal to the number of different error patterns, which is

$$\binom{n}{t} = \binom{1024}{50} = 3.19 \times 10^{85} \approx 2^{284}$$

This may be described as being equivalent to a symmetric key encryption system with a key length of 284 bits.

However, there are much more efficient ways of determining the error pattern in the cryptogram. An attack called information set decoding [2], as described by Professor McEliece in his original paper [8], may be used. For any (*n*, *k*, *d*) code, *k* columns of the generator matrix may be randomly selected and, using Gauss–Jordan elimination of the rows, there is a probability that a permuted, reduced echelon generator matrix will be obtained which generates the same codewords as the original code. The *k* × *k* sub-matrix resulting from the *k* selected columns needs to be full rank and the probability of this depends on the particular code. For Goppa codes the probability turns out to be the same as the probability of a randomly chosen *k* × *k* binary matrix being full rank. This probability is 0.2887, as described below in Sect. 19.5.1.

Given a cryptogram containing *t* errors an attacker can select *k* bits randomly, construct the corresponding permuted, reduced echelon generator matrix with a chance of 0.29. The attacker then uses the matrix to generate a codeword and finds the Hamming distance between this codeword and the cryptogram. If the Hamming distance is exactly *t* then the cryptogram has been cracked.

For this to happen all of the *k* selected bits from the cryptogram need to be error free. The probability of this is:

$$\prod\_{i=0}^{k-1} \frac{n-t-i}{n-i} = \frac{(n-t)!(n-k)!}{(n-t-k)!n!}$$

Including the chance of 0.29 that the selected matrix has rank *k*, the average number of selections of *k* bits from the cryptogram before the cryptogram is cracked, *Nck* is given by

$$N\_{ck} = \frac{(n-t-k)!n!}{0.29(n-t)!(n-k)!} \tag{19.53}$$

For the original code parameters (1024, 524, 101), *Nck* = 4.78 × 10<sup>16</sup> ≈ 2<sup>55</sup>.

This is equivalent to a symmetric key encryption system with a key length of 55 bits, a lot less than 284 bits, the base 2 logarithm of the number of error combinations.

Using a longer code offers much more security. For example, using code parameters (2048, 1300, 137), *Nck* = 1.45 × 10<sup>31</sup> ≈ 2<sup>103</sup>, equivalent to a symmetric key length of 103 bits.

For code parameters (8192, 5124, 473), with a Goppa code which corrects 236 errors, it turns out that *Nck* = 5.60 × 10<sup>103</sup> ≈ 2<sup>344</sup>, equivalent to a symmetric key length of 344 bits.
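Equation (19.53) is conveniently evaluated in log form. The sketch below (an illustration, not code from the text) reproduces the equivalent key lengths quoted for the three codes:

```python
from math import log2

def log2_Nck(n, k, t):
    # log2 of Eq. (19.53): the expected number of k-column selections,
    # i.e. the inverse of the probability that all k selected bits are
    # error-free, divided by the 0.29 full-rank probability.
    log2_p_error_free = sum(log2((n - t - i) / (n - i)) for i in range(k))
    return -log2_p_error_free - log2(0.29)

for n, k, t in [(1024, 524, 50), (2048, 1300, 68), (8192, 5124, 236)]:
    print((n, k, t), log2_Nck(n, k, t))  # equivalent key length in bits
```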

The success of this attack depends upon the code rate. The effect of the code rate *R* on the security, expressed as the equivalent symmetric key length in bits, is shown in Fig. 19.18 for a code length of 2048 bits. The code rate *R* that maximises *Nck* for a given *n* is tabulated in Table 19.3, together with *t*, the number of correctable errors, and the equivalent symmetric key length in bits.

# *19.5.1 Probability of a k* **×** *k Random Matrix Being Full Rank*

The probability of a randomly chosen *k*×*k* binary matrix being full rank is a classical problem related to the erasure correcting capability of random binary codes [4, 5].

**Fig. 19.18** Effect of code rate on security for a code length of 2048 bits


For the binary case it is straightforward to derive the probability *Pk* of a *k* × *k* randomly chosen matrix being full rank by considering the process of Gauss–Jordan elimination. Starting with the first column of the matrix, the probability of finding a 1 in at least one of the *k* rows is 1 − 1/2<sup>*k*</sup>.

Selecting one of these non-zero rows, the bit in the second column will be arbitrary and, considering the first two bits, there are 2<sup>1</sup> dependent linear combinations. As there are 2<sup>2</sup> combinations of two bits, the chance of not finding an independent 2-bit combination in the remaining *k* − 1 rows is 1/2<sup>*k*−1</sup>. Assuming an independent row is found, we next consider the third column and the first three bits. There are 2<sup>2</sup> linear combinations of 3 bits from the two previously found independent rows and there are a total of 2<sup>3</sup> possible combinations of 3 bits. The probability of not finding an independent 3-bit pattern in any of the remaining *k* − 2 rows is (2<sup>2</sup>/2<sup>3</sup>)<sup>*k*−2</sup> = 1/2<sup>*k*−2</sup>.


Proceeding in this way to *k* rows, it is apparent that *Pk* is given by

$$P\_k = \prod\_{i=0}^{k-1} \left(1 - \left(1 - \frac{1}{2}\right)^{k-i}\right) = \prod\_{i=0}^{k-1} \left(1 - \frac{1}{2^{k-i}}\right)\tag{19.54}$$

The probability *Pk* as a function of *k* is tabulated in Table 19.4. The asymptote of 0.288788 is reached for *k* exceeding 18.
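The product in Eq. (19.54) converges quickly to this asymptote, as a few lines of code (illustrative only) confirm:

```python
def full_rank_probability(k):
    # Probability P_k of Eq. (19.54) that a randomly chosen k x k
    # binary matrix over GF(2) has full rank.
    p = 1.0
    for i in range(k):
        p *= 1.0 - 1.0 / 2 ** (k - i)
    return p

for k in (2, 4, 8, 18, 30):
    print(k, round(full_rank_probability(k), 6))  # approaches 0.288788
```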

#### *19.5.2 Practical Attack Algorithms*

Practical attack algorithms of course need to factor in the processing cost of Gauss–Jordan elimination compared to the cost of constructing different generator matrices. A completely different set of *k* coordinates does not need to be selected each time to generate a different generator matrix, as discussed in [2]. Also, even if the *k* selected columns of the generator matrix do not have full rank, usually discarding and adding one or two columns will produce a full rank matrix. It can be shown that on average only 1.6 additional columns are necessary to achieve full rank. Canteaut and Chabaud [3] showed that by including the cryptogram as an additional row in the generator matrix of the (*n*, *k*, 2*t* + 1) code, a code is produced with parameters (*n*, *k* + 1, *t*) for which there is only a single codeword of weight *t*, the original error pattern in the cryptogram. In this case, algorithms for finding low-weight codewords may be deployed to break the cryptogram. However, these low-weight codeword search algorithms are all very similar to the original algorithm aimed at searching for a codeword of the (*n*, *k*, 2*t* + 1) code with Hamming distance *t* from the cryptogram. The conclusion of the literature is that information set decoding is an efficient method of attacking the McEliece system, but that from a practical viewpoint the system is unbreakable provided the code is long enough. Bernstein et al. [2] give recommended code lengths and their corresponding security, providing results similar to those of Table 19.3.

The standard McEliece system is vulnerable to chosen-plaintext attacks. The encoder is the public key, usually publicly available, and the attacker can simply guess the plaintext, construct the corresponding ciphertext and compare this to the target ciphertext. In addition, if the same plaintext message is encrypted twice, the sum of the two ciphertexts is an *n* bit vector of weight 2*t* or less.

The standard McEliece system is also vulnerable to a chosen-ciphertext attack. Assuming a decryption oracle is available, the attacker inverts two bits randomly in the ciphertext and sends the result to the decryption oracle. With probability *t*(*n* − *t*)/(*n*(*n* − 1)), a different ciphertext will be produced containing exactly *t* errors and the decryption oracle will output the plaintext, breaking the system.

Encrypting the plaintext using a key derived from the error pattern, as described above, defeats all of these attacks.

#### **19.6 Applications**

Public key encryption is attractive in a wide range of different applications, particularly those involving communications because the public keys may be exchanged initially using clear text messages followed by information encrypted using the public keys. The private keys remain private because they do not need to be communicated and the public keys are of no help to an eavesdropper.

An example of an application for the iPhone and iPad using the McEliece public key encryption system is the S2S app pictured in Fig. 19.19. In this app, files are encrypted with users' public keys and stored in the cloud so that they may be shared. Sharing is by means of links that index the encrypted files on the cloud and each user uses their private key to decrypt the shared files.

If the same type of application were implemented using symmetric key encryption, it would be necessary for users to share passwords, with all of the associated risks that entails. Using public key encryption avoids these risks. Another application example is the secure Instant Messaging (IM) system PQChat, for the iPhone, iPad and Android devices. There is an option button which shows messages in their received encrypted format, as shown in Fig. 19.20. The name PQChat stands for Post-Quantum Chat, as the McEliece cryptosystem is relatively immune to attack by a quantum computer, unlike the public key encryption systems in common use today, such as Rivest–Shamir–Adleman (RSA) and Elgamal.

As with other public key methods, the system may be used for mutual authentication. Party X sends a randomly chosen nonce *x*1, together with a timestamp, to Party Y using Party Y's public key. Party Y returns a randomly chosen nonce *y*1, a timestamp and *hash*(*hash*(*x*1, *y*1)) to Party X using Party X's public key. Party X replies to Party Y with an encrypted timestamp and acknowledgement, using symmetric key cryptography with encryption key *hash*(*x*1, *y*1), the preimage of *hash*(*hash*(*x*1, *y*1)). The session key *hash*(*x*1, *y*1) is used for further exchanges of information for the duration of the session. A cryptographic hash function is used, such as SHA-3 [12], which also has good second preimage resistance.

**Fig. 19.19** S2S application for sharing encrypted files

It is assumed that the private keys have been kept secret and the association of IDs with public keys has been independently verified. In this case, Party X knows Party Y holds the private key of Y and is the only one able to learn *x*1. Party Y knows Party X holds the private key of X and is the only one able to learn *y*1. Consequently Party X and Party Y are the only ones with knowledge of *x*1 and *y*1. Using the preimage of *hash*(*hash*(*x*1, *y*1)) as the session key provides added assurance, as both *x*1 and *y*1 need to be known in order to generate the key *hash*(*x*1, *y*1). The timestamps prevent replay attacks.
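The key derivation in this exchange can be sketched with SHA3-256 from Python's hashlib; the nonce sizes and helper names are assumptions, and the public key transport of *x*1 and *y*1 is elided:

```python
import hashlib
import os

def h(*parts: bytes) -> bytes:
    # SHA-3 hash over the concatenated inputs.
    return hashlib.sha3_256(b"".join(parts)).digest()

# Party X sends nonce x1 (with a timestamp) under Y's public key.
x1 = os.urandom(32)
# Party Y returns nonce y1 and the commitment hash(hash(x1, y1))
# under X's public key.
y1 = os.urandom(32)
commitment = h(h(x1, y1))

# Party X, knowing both nonces, derives the session key hash(x1, y1),
# the preimage of the commitment, proving knowledge of x1 and y1.
session_key = h(x1, y1)
print(h(session_key) == commitment)  # True: the commitment verifies
```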

## **19.7 Summary**

A completely novel type of public key cryptosystem was invented by Professor Robert McEliece in 1978, based on error-correcting Goppa codes. Other well-established public key cryptosystems are based on the difficulty of determining logarithms in finite fields and, in theory, can be broken by quantum computers. Despite numerous attempts by the crypto community, the McEliece system remains unbroken to this day and is one of the few systems predicted to survive attacks by powerful computers in the future. In this chapter, some variations to the McEliece system have been described, including a method which destroys the deterministic link between plaintext messages and ciphertexts, thereby providing semantic security. Consequently, this method nullifies the chosen-plaintext attack, to which the classic McEliece system is vulnerable. It is shown that the public key size can be reduced and that, by encrypting the plaintext with a key derived from the random error pattern of the ciphertext, the security of the system is improved, since an attacker has to determine the exact same error pattern used to produce the ciphertext. This defeats the chosen-ciphertext attack in which two random bits of the ciphertext are inverted, to which the standard McEliece system is vulnerable. The security of the McEliece system has been analysed and a shortened ciphertext system has been proposed which does not suffer any consequent loss of security due to shortening. This is important because, to achieve 256 bits of security, the security analysis has shown that the system needs to be based on Goppa codes of length 8192 bits. Key encapsulation and short plaintext applications need short ciphertexts in order to be efficient. It is shown that the ciphertext may be shortened to 1912 bits, providing 256 bits of security and an information payload of 768 bits.
Some examples of interesting applications that have been implemented on a smartphone in commercial products, such as a secure messaging app and secure cloud storage app, have been described in this chapter.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 20 Error-Correcting Codes and Dirty Paper Coding**

#### **20.1 Introduction and Background**

In the following we are concerned with impressing information on an independent signal, such as an image or an audio stream, with the aim of keeping the additional energy used consistent with reliable detection of the information. Information can even be impressed on background noise with no apparent signal present. A secondary aim is that, in impressing the information, the independent signal should suffer a minimal amount of degradation or distortion, to the point that in some circumstances the difference is virtually undetectable.

#### **20.2 Description of the System**

The following, for simplicity, is first described in terms of using binary codes and binary information. It is shown later that the method may be generalised to non-binary codes and non-binary information. The independent signal or noise is denoted by the waveform *v*(*t*) and the information carrying signal to be impressed on the waveform *v*(*t*) is denoted by *s*(*t*). The resulting waveform *w*(*t*) is simply given by the sum:

$$\mathbf{w}(t) = \mathbf{v}(t) + \mathbf{s}(t) \tag{20.1}$$

The decoder which is used to determine *s*(*t*) from the received waveform will usually be faced with additional noise, interference and sometimes distortion due to the receiving equipment or the transmission. With no distortion, the input to the decoder is denoted by *r*(*t*) and given by:

$$r(t) = \nu(t) + s(t) + n(t)\tag{20.2}$$

© The Author(s) 2017 M. Tomlinson et al., *Error-Correction Coding and Decoding*, Signals and Communication Technology, DOI 10.1007/978-3-319-51103-0\_20


In its simplest form *s*(*t*) carries only one bit of information and

$$\mathbf{s}(t) = k\_0 \mathbf{s}\_0(t) - k\_1 \mathbf{s}\_1(t) \tag{20.3}$$

to convey data 0, and

$$\mathbf{s}(t) = k\_0 \mathbf{s}\_1(t) - k\_1 \mathbf{s}\_0(t) \tag{20.4}$$

to convey data 1.

The multiplicative constants *k*<sub>0</sub> and *k*<sub>1</sub> are chosen to adjust the energy of the information carrying signal, and *k*<sub>1</sub> is used to reduce the correlation of the alternative information carrying signal that could cause an error in the decoder. The multiplicative constants *k*<sub>0</sub> and *k*<sub>1</sub> are normally chosen as a function of *v*(*t*), the main component of interference in the decoder, which is attempting to decode *r*(*t*).

In conventional communications, *s*<sub>0</sub>(*t*) (or *s*<sub>1</sub>(*t*)) is transmitted or stored, and *s*<sub>0</sub>(*t*) (or *s*<sub>1</sub>(*t*)) is decoded despite the presence of interference or noise. Here, *s*(*t*) is added to *v*(*t*) and *s*<sub>0</sub>(*t*) (or *s*<sub>1</sub>(*t*)) is decoded from the composite waveform *v*(*t*) + *s*(*t*) despite the presence of additional interference or noise.

Noting that the transmitter has no control over the independent signal or noise *v*(*t*), a good strategy is to choose *s*<sub>0</sub>(*t*) and *s*<sub>1</sub>(*t*) from a large number of possible waveforms in order to produce waveforms which have a large correlation with *v*(*t*). Each possible waveform is constrained to be a codeword from an (*n*, *k*, *dmin*) error-correcting code. In one approach using binary codes, the 2<sup>*k*</sup> codewords are partitioned into two disjoint classes: codewords having even parity and codewords having odd parity. The codeword *s*<sub>0</sub>(*t*) is the even parity codeword with the highest correlation out of all even parity codewords and the codeword *s*<sub>1</sub>(*t*) is the odd parity codeword with the highest correlation out of all odd parity codewords. The idea is that *w*(*t*) should have maximum correlation with *s*<sub>0</sub>(*t*) if the information data is 0, compared to any of the other 2<sup>*k*</sup> − 1 codewords. Conversely, if the information data is 1, *w*(*t*) should have maximum correlation with *s*<sub>1</sub>(*t*) compared to any of the other 2<sup>*k*</sup> − 1 codewords. As there is a minimum Hamming distance of *dmin* between codewords, this prevents small levels of additional noise or interference from causing an error in detecting the data in the decoder.

As an example, consider a typical sequence of 47 Gaussian noise samples *v*(*t*) as shown in Fig. 20.1. A binary quadratic residue code [4], the (47, 24, 11) code described in Chap. 4, is used, and the highest correlation codeword having even parity is determined using a near maximum likelihood decoder, the modified Dorsch decoder described in Chap. 15. The waveform of Fig. 20.1 is input to the decoder. The highest correlation codeword, which has a correlation value of 20.96, is the codeword:

{0 1 0 0 1 0 0 1 1 0 0 1 1 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0}

The highest correlation, odd parity codeword, is then determined. This codeword, which has a correlation value of 22.65, is the codeword:

{1 1 1 0 1 0 0 1 0 1 1 0 0 0 0 1 1 0 0 1 1 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0}

**Fig. 20.1** Noise waveform to be impressed with data

It should be noted that in carrying out the correlations, codeword 1's are mapped to −1's and codeword 0's are mapped to +1's.

The information bit to be impressed on the noise waveform is, say, data 0, in which case the watermarked waveform *w*(*t*) needs to produce a maximum correlation with an even parity codeword. Correspondingly, the value given to *k*<sub>0</sub> is 0.156 and to *k*<sub>1</sub> is 0.02 in order to make sure that the codeword which produces the maximum correlation with the marked waveform is the previously found even parity codeword:

{0 1 0 0 1 0 0 1 1 0 0 1 1 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0}

The marked waveform *w*(*t*) is as shown in Fig. 20.2. It may be observed that the difference between the marked waveform and the original waveform is small. In the decoder it is found that the codeword with highest correlation with the marked waveform *w*(*t*) is indeed the even parity codeword:

{0 1 0 0 1 0 0 1 1 0 0 1 1 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0}

and this codeword has a correlation of 28.31.

**Fig. 20.2** Noise waveform impressed with data 0

One advantage of this watermarking system over conventional communications is that the watermarked waveform may be tested using the decoder. If there is insufficient margin, adjustments may be made to the variables *k*<sub>0</sub> and *k*<sub>1</sub> and a new watermarked waveform produced. Conversely, if there is more than adequate margin, adjustments may be made to the variables *k*<sub>0</sub> and *k*<sub>1</sub> so that there is less degradation to the original waveform *v*(*t*).

The highest correlation odd parity codeword, with correlation 25.31, is the codeword:

{1 1 1 0 1 0 0 1 0 1 1 0 0 0 0 1 1 0 0 1 1 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0}

It should be noted that this odd parity codeword is the same odd parity codeword as determined in the encoder, but this is not always the case depending upon the choice of values for *k*<sup>0</sup> and *k*1.

For the case where the information bit is a 1, the marked waveform *w*(*t*) needs to produce a maximum correlation with an odd parity codeword. In this case, the value of *k*<sup>0</sup> is 0.043 and the value of *k*<sup>1</sup> is 0.02 and *s*(*t*) = *k*0*s*1(*t*) − *k*1*s*0(*t*). The marked waveform *w*(*t*) is as shown in Fig. 20.3. This time in the decoder it is found that the codeword with highest correlation with *w*(*t*) is indeed the odd parity codeword:

{1 1 1 0 1 0 0 1 0 1 1 0 0 0 0 1 1 0 0 1 1 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0}

and this codeword has a correlation of 24.70. The highest correlation even parity codeword has a correlation of 22.02.
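The procedure of this example can be sketched in miniature. The code below substitutes a (7, 4) Hamming code for the (47, 24, 11) quadratic residue code and a brute-force search over all 16 codewords for the modified Dorsch decoder; the values of *k*0 and *k*1 are adjusted by testing with the decoder, as described above:

```python
import itertools
import random

# Toy sketch of the watermarking scheme using a (7, 4) Hamming code.
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]

CODEWORDS = [tuple(sum(m[i] * G[i][j] for i in range(4)) % 2 for j in range(7))
             for m in itertools.product([0, 1], repeat=4)]

def waveform(c):
    # Map codeword bit 0 -> +1 and bit 1 -> -1, as in the correlations above.
    return [1 - 2 * bit for bit in c]

def correlation(v, c):
    return sum(vj * cj for vj, cj in zip(v, waveform(c)))

def best(v, parity):
    # Highest-correlation codeword in the even (0) or odd (1) parity class.
    return max((c for c in CODEWORDS if sum(c) % 2 == parity),
               key=lambda c: correlation(v, c))

random.seed(42)
v = [random.gauss(0, 1) for _ in range(7)]   # independent noise waveform
data = 1                                     # bit to embed: odd-parity class

s_data, s_other = best(v, data), best(v, 1 - data)
k0, k1 = 0.1, 0.02
while True:
    # Impress s(t) = k0*s_data(t) - k1*s_other(t) on v(t), then test the
    # marked waveform with the decoder; if the margin is insufficient,
    # increase k0 and try again (the adjustment described above).
    w = [vj + k0 * sd - k1 * so
         for vj, sd, so in zip(v, waveform(s_data), waveform(s_other))]
    decoded = max(CODEWORDS, key=lambda c: correlation(w, c))
    if decoded == s_data:
        break
    k0 *= 1.5

print(sum(decoded) % 2)   # parity of the decoded codeword recovers the data bit
```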

**Fig. 20.3** Noise waveform impressed with data 1

In the encoding and decoding procedure above, the maximum correlation codeword needs to be determined. For short codes a maximum likelihood decoder [6] may be used. For medium length codes, up to 200 bits long, the near maximum likelihood decoder, the modified Dorsch decoder of Chap. 15, is the best choice. For longer codes, an LDPC code [3], turbo code [1] or turbo product code [7] may be used in conjunction with the appropriate iterative decoder. An example of a decoder for LDPC codes is given by Chen [2].

Once the maximum correlation codeword has been found, codewords with similar, high correlation values, may be found from the set of codewords having small Hamming distance from the highest correlation codeword. Linear codes are the most useful codes because the codewords with high correlations with the target waveform are given by the sum of the highest correlation codeword and the low-weight codewords of the code, modulo *q*, (where *GF*(*q*) is the base field [4] of the code). The low-weight codewords of the code are fixed and may be derived directly as described in Chaps. 9 and 13, or determined from the weight distribution of the dual code [4].

For practical implementations, the most straightforward approach is to restrict the codes to binary codes less than 200 bits long and determine the high correlation codewords by means of the modified Dorsch decoder. This decoder can conveniently output a ranked list of the high cross-correlation codewords together with their correlation values. It is straightforward to modify the decoder so as to provide the output codewords in odd and even parity classes, with the maximum correlation codeword for each class. The results for the example above were determined in this way.

Additional information may be impressed upon the independent signal or noise by partitioning the 2<sup>*k*</sup> codewords into more than two disjoint classes. For example, four disjoint classes may be obtained by partitioning the codewords according to odd and even parity for the odd numbered codeword bits and odd and even parity for the even numbered codeword bits. Namely, if the codewords are represented as:

$$c(\mathbf{x}) = c\_0 + c\_1 \mathbf{x} + c\_2 \mathbf{x}^2 + c\_3 \mathbf{x}^3 + c\_4 \mathbf{x}^4 + \dots + c\_{k-1} \mathbf{x}^{k-1} \tag{20.5}$$

then the codewords are partitioned according to the values of *p*<sup>0</sup> and *p*<sup>1</sup> given by

$$\begin{aligned} p\_0 &= c\_0 + c\_2 + c\_4 + c\_6 \cdots + c\_{k-1} \text{ modulo } 2 = 0\\ p\_1 &= c\_1 + c\_3 + c\_5 + c\_7 \cdots + c\_{k-2} \text{ modulo } 2 = 0 \end{aligned}$$

or with the result

$$p\_0 = c\_0 + c\_2 + c\_4 + c\_6 \cdots + c\_{k-1} \text{ modulo } 2 = 1$$

$$p\_1 = c\_1 + c\_3 + c\_5 + c\_7 \cdots + c\_{k-2} \text{ modulo } 2 = 1$$

Clearly the procedure may be extended to *m* parity bits by partitioning the 2<sup>*k*</sup> codewords into 2<sup>*m*</sup> disjoint classes. In this case, following encoding, *m* bits of information will be conveyed by the marked waveform and determined from the codeword which has the highest correlation with the marked waveform, by virtue of which of the 2<sup>*m*</sup> classes this codeword resides in.
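As an illustration, the partition into 2<sup>*m*</sup> classes for *m* = 2 might be computed as follows (the function name is hypothetical):

```python
def class_index(codeword):
    # Partition into one of 2^m = 4 disjoint classes using two parity
    # sums: p0 over the even-numbered bits, p1 over the odd-numbered bits.
    p0 = sum(codeword[0::2]) % 2
    p1 = sum(codeword[1::2]) % 2
    return (p1 << 1) | p0

print(class_index((1, 1, 0, 0, 1, 0, 0)))  # p0 = 0, p1 = 1 -> class 2
```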

An alternative to this procedure is to use non-binary codes [5] with a base field of *GF*(*q*), as described in Chap. 7. For convenience, a base field of *GF*(2<sup>m</sup>) may be used so that each symbol of a codeword is represented by *m* bits. In this case, codewords are partitioned into 2<sup>m</sup> classes according to the value of the overall parity sum:

$$p_0 = c_0 + c_1 + c_2 + c_3 + c_4 + \dots + c_{k-1} \text{ modulo } 2^m \tag{20.6}$$

The *n* non-binary symbols of each codeword may be mapped into *n* Pulse Amplitude Modulation (PAM) symbols [6], into *n* × *m* binary symbols, or into a similar hybrid combination before correlation with *v*(*t*).
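Taking the modulo-2<sup>m</sup> sum of Eq. (20.6) at face value, the class label of a non-binary codeword may be computed as below. The symbol values are assumed for illustration; note that the natural sum within *GF*(2<sup>m</sup>) itself would be a bitwise exclusive-or rather than an integer sum:

```python
# Illustrative sketch: class label of a non-binary codeword whose
# GF(2^m) symbols are held as integers in 0..2^m - 1.  The text's
# "modulo 2^m" sum is taken at face value as an integer sum.
m = 4
codeword = [3, 7, 0, 12, 5]   # assumed example symbols

p0 = sum(codeword) % 2**m     # class label in 0..2^m - 1, Eq. (20.6)
print(p0)                     # (3 + 7 + 0 + 12 + 5) mod 16 = 11
```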

Rather than choosing codewords with maximum correlation with the waveform to be marked, codewords may be chosen that have near zero correlation with it. Information is then conveyed by the marked waveform through the addition to *v*(*t*) of a codeword which is orthogonal, or near orthogonal, to the codeword having maximum correlation with the independent signal or noise waveform *v*(*t*). In this case, the codeword with maximum correlation to *v*(*t*) is denoted as *s*<sub>max</sub>(*t*), and codewords that are orthogonal or near orthogonal to *s*<sub>max</sub>(*t*) are denoted as *s*<sub>max,*i*</sub>(*t*) for *i* = 1 to 2<sup>m</sup>. The signal impressed upon *v*(*t*) is:

$$s(t) = k_0 s_{\text{max}}(t) + k_1 s_{\text{max},\eta}(t) \tag{20.7}$$

where η determines which one of the 2<sup>m</sup> orthogonal codewords is impressed on the waveform to convey the *m* bits of information data. The addition of the maximum correlation codeword *k*<sub>0</sub>*s*<sub>max</sub>(*t*) to *v*(*t*) ensures that *s*<sub>max</sub>(*t*) is still the codeword with maximum correlation after the waveform has been marked. Although the codewords *s*<sub>max,*i*</sub>(*t*) for *i* = 1 to 2<sup>m</sup> are orthogonal to *k*<sub>0</sub>*s*<sub>max</sub>(*t*), they are not necessarily orthogonal to *v*(*t*). In this case, the signal impressed upon *v*(*t*) needs to be:

$$s(t) = k_0 s_{\text{max}}(t) + \sum_{i=1}^{2^m} k_i s_{\text{max},i}(t) \tag{20.8}$$

The coefficients *k*<sub>*i*</sub> will usually be small in order to produce near zero correlation of the codewords *s*<sub>max,*i*</sub> with *w*(*t*), except for the coefficient *k*<sub>*j*</sub>, which is made larger in order to produce a strong correlation with the codeword *s*<sub>max,*j*</sub>.

The choice of the multiplicative constants *k*<sub>0</sub> and *k*<sub>1</sub>, or of the multiplicative constants *k*<sub>*i*</sub> in the general case (these adjust the energy of the components of the information signal), depends upon the expected levels of additional noise or interference and the acceptable level of decoder error probability. If the marked signal to noise ratio is represented as *SNR*<sub>*z*</sub>, the marked signal energy as *E*<sub>*z*</sub>, and the difference between the highest and the next highest codeword correlation is Δ<sub>*c*</sub>, then the probability of decoder error *p*(*e*) is upper bounded by:

$$p(e) \leqslant \frac{1}{2}\,\text{erfc}\left(\frac{\Delta_c^2 \cdot SNR_z}{8 \cdot E_z}\right)^{0.5} \tag{20.9}$$

This is under the assumption that there is only one codeword close in Euclidean distance to the maximum correlation codeword.
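The bound of Eq. (20.9) is straightforward to evaluate numerically. A minimal sketch using `math.erfc`, with assumed illustrative values for Δ<sub>*c*</sub>, *SNR*<sub>*z*</sub> and *E*<sub>*z*</sub>:

```python
# Illustrative sketch: evaluating the decoder error bound of
# Eq. (20.9).  The argument values are assumed for demonstration only.
import math

def error_bound(delta_c, snr_z, e_z):
    # p(e) bounded by 0.5 * erfc( sqrt( delta_c^2 * SNR_z / (8 * E_z) ) )
    return 0.5 * math.erfc(math.sqrt(delta_c ** 2 * snr_z / (8.0 * e_z)))

print(error_bound(delta_c=4.0, snr_z=10.0, e_z=8.0))
```

As expected, a larger correlation gap Δ<sub>*c*</sub> tightens the bound, since the erfc argument grows with Δ<sub>*c*</sub>.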

The multiplicative constants may be selected "open loop" or "closed loop". In "closed loop" operation, which is a further variation of the system, the encoding is followed by a testing phase. After encoding, the information is decoded from the marked waveform and the margin for error is determined. Different levels of noise or interference may be artificially added to the marked waveform prior to decoding in order to assist in determining the margin for error. If the margin for error is found to be insufficient, the multiplicative constants may be adjusted and a new marked waveform *w*(*t*) produced and tested.
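The "closed loop" testing phase can be sketched as follows. The `embed` and `correlate` functions, the ±1 codeword waveform and the margin threshold are all hypothetical stand-ins for the encoder and decoder machinery described in the text:

```python
# Illustrative sketch of "closed loop" selection of a multiplicative
# constant: add artificial test noise to the marked waveform, check
# the decoding margin, and increase the embedding strength until the
# margin is sufficient.  All signals and thresholds are assumed.
import random

def embed(v, s, k0):
    return [vi + k0 * si for vi, si in zip(v, s)]

def correlate(w, s):
    return sum(wi * si for wi, si in zip(w, s))

random.seed(1)
n = 64
v = [random.gauss(0, 1) for _ in range(n)]        # independent noise waveform
s = [random.choice([-1, 1]) for _ in range(n)]    # codeword as a +/-1 waveform

k0, margin_target = 0.1, 5.0
while True:
    # marked waveform with artificial test noise added prior to decoding
    w = [wi + random.gauss(0, 0.1) for wi in embed(v, s, k0)]
    if correlate(w, s) >= margin_target:
        break                                     # margin is sufficient
    k0 *= 1.5                                     # insufficient: strengthen
print(k0)
```

The geometric increase of `k0` guarantees the loop terminates; a real implementation would trade the final `k0` against the acceptable level of distortion.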

In the decoder, once the maximum correlation codeword has been detected from the marked signal or noise waveform, candidate orthogonal, or near orthogonal, codewords are generated from it. These candidates are cross correlated with the marked waveform in order to determine which weighted orthogonal, or near orthogonal, codewords were added to it. The cross correlation coefficients of the detected codewords are then used to determine the additional information which was impressed on the marked waveform.

In order to clarify the description, Fig. 20.4 shows a block diagram of the encoder for the example of a system conveying two information bits. The independent signal or noise is input to a buffer memory which feeds a maximum correlation decoder, which will usually be a modified Dorsch decoder. The maximum correlation decoder has as input the error-correcting code parameters (*n*, *k*, *d*<sub>min</sub>) and the code partition information. In this case the partition information is used to partition the codewords into four classes. The codeword in each class having the highest correlation, together with its correlation value, is output as shown in Fig. 20.4. From the input data and these correlation values, the multiplicative constants are determined. The coefficients of each codeword are weighted by these constants and added to the stored independent signal or noise to produce the marked waveform, which is output from the encoder.

**Fig. 20.4** Encoder for two information bits using near orthogonal codewords

**Fig. 20.5** Decoder for marked waveform containing orthogonal codewords

Figure 20.5 shows a block diagram of the corresponding decoder. The marked waveform is input to a buffer memory which feeds a maximum correlation decoder. The error-correcting code parameters of the same (*n*, *k*, *d*<sub>min</sub>) code and the code partition information are also input to the maximum correlation decoder. The codeword with the highest correlation is determined, the class in which this codeword resides is found, and the two bits of data identifying this class are output from the decoder.
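A toy version of this decoder can be sketched by brute-forcing the maximum correlation codeword (standing in for the modified Dorsch decoder) and reading off the two class bits. The [7,4] generator matrix and the distorted waveform are assumed examples, not taken from the text:

```python
# Illustrative sketch of the Fig. 20.5 decoder for a toy [7,4] code:
# find the maximum correlation codeword by exhaustive search (a
# stand-in for the modified Dorsch decoder), then output the two data
# bits naming its parity class.
from itertools import product

G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0, 1],
]
k, n = 4, 7

def codewords():
    for msg in product([0, 1], repeat=k):
        yield [sum(msg[i] * G[i][j] for i in range(k)) % 2 for j in range(n)]

def bipolar(cw):
    # map codeword bits to a +/-1 waveform
    return [1.0 if b else -1.0 for b in cw]

def decode(w):
    best = max(codewords(),
               key=lambda cw: sum(wi * si for wi, si in zip(w, bipolar(cw))))
    p0 = sum(best[0::2]) % 2   # parity of even-numbered bits
    p1 = sum(best[1::2]) % 2   # parity of odd-numbered bits
    return best, (p0, p1)

# marked waveform: one codeword's waveform plus a mild distortion
target = [1, 0, 0, 0, 1, 1, 0]
w = [si + 0.2 for si in bipolar(target)]
cw, bits = decode(w)
print(cw, bits)               # [1, 0, 0, 0, 1, 1, 0] (0, 1)
```

With this mild distortion the correlation of the true codeword (6.8) comfortably exceeds that of any other codeword (at most 2.4, since the code has minimum distance 3), so the class bits are recovered correctly.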

In a further approach, additional information may be conveyed by adding weighted codewords to the marked signal or noise waveform such that these codewords are orthogonal, or near orthogonal, to the codeword having maximum correlation with the marked signal or noise waveform.

#### **20.3 Summary**

This chapter has described how error-correcting codes can be used to impress additional information onto waveforms with a minimal level of distortion. Applications include watermarking and steganography. A method has been described in which the modified Dorsch decoder of Chap. 15 is used to find codewords from partitioned classes of codewords, whose waveforms may be used as a watermark that is almost invisible yet can still be reliably detected.

## **References**



# **Index**

#### **C**

Code
- algebraic geometry (AG), 181
- double-circulant, 207
- extended quadratic residue, 218
- extremal, 206
- formally self-dual, 206
- quadratic double-circulant, 222
- self-dual, 206

Curve
- affine, 186
- irreducible, 188
- maximal, 189
- nonsingular, 188
- smooth, 188

#### **D**

Divisor, 189
- effective, 190

#### **F**

Field of fractions, 190

#### **G**

Genus, 182

Group
- projective special linear (PSL), 211

#### **H**

Hasse–Weil, 189

#### **M**

Matrix
- circulant, 207

#### **P**

Place
- degree, 196

Plane
- affine, 186
- projective, 186

Point
- at infinity, 187
- singular, 188

#### **R**

Residue
- quadratic, 208

#### **S**

Singleton defect, 182

Space
- Riemann–Roch, 191
